ABSTRACT


Printed matter (text or typewriting), as a two-dimensional, two-valued,
stochastic process, is studied to determine how its statistical
properties vary with resolution. Spatial resolution is digital, with
the document image dissected into small rectangular "scan elements."
Quantizing is two-level, with a scan element defined as either black or
white, using a 50 percent decision level after integration over the
element area.

The tradeoff between document quality and document entropy is ex- 
plored by taking each factor separately and modeling its dependence on 
spatial frequency. Since these factors really depend on the ratio be- 
tween resolution and character size, all results are reported in terms 
of normalized spatial frequency. Stroke width is used to express char- 
acter size in this normalization. 

Legibility of characters, out-of-context, is used as the document
quality measure for printed matter. A piece-wise linear model for the 
dependence of legibility on resolution is postulated, based on concepts 
from the sampling theorem. The recovery of character-strokes when under- 
sampled is assumed proportional to the intelligible recovery of charac- 
ters (legibility). An empirical model is also fitted to the data, as it 
varies over combinations of two-dimensional resolution. 

A succession of increasingly complex source alphabets is examined
for encoding the document image. One- and two-dimensional alphabets are 
tried out before exploring resolution variation for the better ones. 
Horizontal and vertical resolution are kept equal for these experiments. 
An empirical model is derived to express the behavior of document entropy 
with resolution for the various alphabets. Resolution efficiency is de- 
fined in order to compare the compression achieved at a given resolution 
with compression at the Nyquist interval for character-strokes. This 
measure is proposed in order to evaluate the high compression values 
obtained in using sampling rates much higher than the Nyquist rate. 
Chapter I


INTRODUCTION 


1.1. Background 

In the two decades since Shannon's fundamental contribution 
[Shannon - 1948], considerable effort has been expended in applying 
information theory to a variety of problem areas. The reduction of 
redundancy in digitized images is but one such application. More re- 
cently, industry has been grappling with the practical implementation 
of these ideas giving rise to new questions and problem areas. 

For example, office copiers have developed a vast market in 
the business world; the use of machines for handling business documents 
has become commonplace. A natural extension is the storage and forward- 
ing of document images. To be sure, facsimile has been around for a 
long time. But only recently has a large market and new technology com- 
bined to make sophisticated image processing systems feasible. Applying 
information theory to the processing of digital images provides the capa- 
bility to encode for compression or noise immunity. Regeneration of the 
image is possible in the face of many processing steps that might other- 
wise cause image degradation. 

The design of such systems requires more than the theoretical 
concepts of entropy, redundancy reduction, and compression codes. Sta- 
tistics characterizing typical documents are required to design matching 
compression codes. In this case, "representative" documents must be
defined. For such representative documents the statistics are dependent 
on the resolution of the digitizing process. Resolution also impacts the 
"quality" of a document image. This dependence on resolution, and the 
resulting tradeoffs for document coding and document quality, are the 
major topics of this study. 

Typically, empirical results in the literature are hard to 
extend or generalize quantitatively. One must subjectively judge quality 
using the arbitrary image of a face, crowd, etc., for comparison before
and after some processing step. Compressions using a code are reported
for a specific resolution, without a feel for their dependence on the
resolution chosen. Extrapolation of these results to other documents
or resolutions can be made only qualitatively. 

What is presented here is a quantitative exploration of docu- 
ment quality and compression. Concentration on print is motivated by 
the preponderance of it in the images of business documents. Also, the
emphasis on printed matter, and the nature of processing in contemporary
copying machines, have caused me to restrict my attention to two-level
image processing (as contrasted with "grey-scale").

1.2. Summary 

Two measures for the information inherent in printed documents 
— legibility and entropy — have been explored here, in an attempt to 
characterize their behavior with resolution. Though philosophically
similar, their characteristics seemed not to overlap, with mechanical
measurements of entropy tending only to approximate crudely the infor-
mation discerned by human observers.

In Chapter II, the legibility of alphanumeric characters is 
used as a measure for document quality. The physical recovery of char- 
acter structure is proposed as a major factor determining the behavior 
of legibility with decreasing resolution. A piece-wise linear model is 
presented for the recovery of characters, based on undersampling of the 
strokes that constitute their structure (Fig. 2.1-4). Legibility mea- 
surements for the same documents are fitted with an analytic relation- 
ship, expressed in terms of spatial frequency over both dimensions 
(Eq. 2.2-18). This empirical model has been normalized with respect to 
character size and appears to have the same shape for differing type 
fonts. 

Entropy is used in Chapter III to measure the performance of 
successively complex coding schemes. Resolution is varied here over 
the same spatial frequencies used in Chapter II. Both one- and two- 
dimensional codes are explored, in an attempt to understand the mecha- 
nisms underlying compression for the images of printed matter. 

From the insight gained, a model for document entropy is con-
structed based on page size, character size and density, and resolution
(Eq. 3.4-20). For a particular type font this model fitted to data ap-
pears to be independent of the actual distribution of characters in the
sample.

SEL-68-102

A definition is proposed for resolution efficiency, relative
to the Nyquist rate for the character structure (Eq. 3.4-23). This is
called ε*-efficiency and embodies all the resolution variation. It has
been used to define simple expressions for ε*-symbol entropy, ε*-symbol
compression, and ε*-page entropy (Eq. 3.4-25, 26, and 27). Also, the
ability of a scan pattern to handle varying character sizes is specified
by ε*-efficiency.

Possible extensions to this work are noted in Chapter IV. 
Chapter II


THE THRESHOLD OF LEGIBILITY WITH SPATIAL FREQUENCY 

A useful measure for image quality in printed matter is how well it 
can be read. Business documents add a constraint to this, in that context 
may not always be helpful in deciphering an unclear image. Financial state- 
ments or part number listings, for example, must preserve their character 
integrity throughout any image processing. Therefore, out-of-context char- 
acter "legibility" is used here for print quality specification. 

Historically, legibility has been used as a measure of quality to in- 
vestigate the effects of type font, stroke width, and other parameters of 
printed characters. More recently legibility has been used to measure 
quality in various kinds of visual displays. Many of the basic investiga- 
tions of print are summarized in Legibility of Print [Tinker - 1963]. The 
extent of current literature on display quality is indicated in recent 
bibliographies [Cornog and Rose - 1967] and [Shurtleff - 1967].

My interest here lies in the effects of resolution on the digitized 
images of printed matter. Attention has been restricted to rectangular 
resolution elements variable in both dimensions. Parameters dependent on 
resolution are specified over a "resolution plane" where vertical and
horizontal resolution are the axes. 

2.1. Theoretical Model Based on the Sampling Theorem

An example of throughput for the experimental system (Appendix
A.1) can be seen in Fig. 2.1-1, where a "before" and "after" microphoto-
graph of the letter K is shown. The original type font, which is called
Mid-Century, was selected for its clean strokes which are not adorned with
serifs. The particular character illustrated came from the word THINK in
Document A1 (Appendix A.2). The digitizing process is readily evident in
the reproduced character, which has been built up from small black or white
rectangles.
Fig. 2.1-1. CHARACTER BEFORE AND AFTER DIGITAL PROCESSING
(20X MAGNIFICATION).

The scanning process is illustrated in Fig. 2.1-2, where an 
idealized K is displayed superimposed with a rectangular scan element. 
As the area-integrating element moves in the direction of scan, it is 
periodically sampled and classified as either black or white. 

The following dimensions are also shown in Fig. 2.1-2:

$w_s$ = stroke width (mils)

$l_s$ = stroke length, for a straight stroke (mils)

$d_s$ = scan element dimension in the scanning direction (mils)

$d_p$ = scan element dimension perpendicular to the scanning direction (mils)

Angles are all defined relative to the abscissa in a rectangular x,y
coordinate system:

$\theta_s$ = stroke angle (degrees)

$\theta_e$ = element scan direction (degrees)
The following functions [Goodman - 1968] will also be useful in
describing the scanning process.

Comb function:

$$\mathrm{comb}(x) = \sum_{n=-\infty}^{\infty} \delta(x - n), \quad \text{where } \delta(x) \text{ is the Dirac delta function}$$

Rectangle function:

$$\mathrm{rect}(x) = \begin{cases} 1, & |x| \le \tfrac{1}{2} \\ 0, & |x| > \tfrac{1}{2} \end{cases}$$

Sinc function:

$$\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$$
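The rect and sinc definitions translate directly into code. The sketch below is only an illustration: the function names are hypothetical, and since a Dirac comb cannot be represented numerically, `sample_on_comb` stands in for it by keeping a function's values only at the comb positions.

```python
import math


def rect(x: float) -> float:
    """Rectangle function: 1 for |x| <= 1/2, 0 otherwise."""
    return 1.0 if abs(x) <= 0.5 else 0.0


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x)/(pi x), using the limit value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def sample_on_comb(g, spacing: float, n_min: int, n_max: int):
    """Discrete stand-in for g(x) * comb(x/spacing): keep g only at x = n*spacing."""
    return [(n * spacing, g(n * spacing)) for n in range(n_min, n_max + 1)]
```

For instance, `sample_on_comb(lambda x: rect(x / 3.0), 1.0, -2, 2)` samples a rectangle of width 3 at unit spacing, so only the three central samples are nonzero.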





2.1.1. Transfer function of the digital image-processing system

For a general model of document digitizing, define:

$g_d(x,y)$ = document reflectivity function

$a(x,y)$ = aperture function of the scan element
Scanning results in a two-dimensional convolution between the two:

$$g_e(x,y) = \text{scan element output} = g_d(x,y) * a(x,y) \tag{2.1-1}$$

For digital image processing, the scanner output must be sampled:

$$g_s(x,y) = \text{sampler output} = g_e(x,y)\,\mathrm{comb}\!\left(\frac{x}{X}\right)\mathrm{comb}\!\left(\frac{y}{Y}\right) \tag{2.1-2}$$

where the x and y sampling intervals are X and Y respectively.
Finally, analog-to-digital (A/D) conversion introduces a nonlinear opera-
tion that I symbolize with a subscripted C-bracket:

$$g_c(x,y) = \text{A/D converter output} = C\{g_s(x,y)\}_{\ell} \tag{2.1-3}$$

where $\ell$ is the number of grey levels used in the conversion.

Signal restoration can be represented in the spatial fre-
quency domain by multiplication with a filter characteristic:

$H(f_X, f_Y)$ = filter spatial frequency response

or one can equivalently use convolution in the space domain:

$h(x,y)$ = filter spatial response

The restored document image then becomes:

$$\hat{g}_d(x,y) = \text{D/A converter output} = g_c(x,y) * h(x,y) \tag{2.1-4}$$
Combining Eq. 2.1-1, 2, 3, and 4 produces the glorious result:

$$\hat{g}_d(x,y) = C\left\{\left[g_d(x,y) * a(x,y)\right]\mathrm{comb}\!\left(\frac{x}{X}\right)\mathrm{comb}\!\left(\frac{y}{Y}\right)\right\}_{\ell} * h(x,y) \tag{2.1-5}$$

Case A. rect filtering for analog sampling

Classically, in the Whittaker-Shannon sampling theorem
[Whittaker - 1915], [Shannon - 1949] one has:

$$G_d(f_X, f_Y) = 0, \quad \text{for } |f_X| > \frac{1}{2X},\ |f_Y| > \frac{1}{2Y} \quad \text{(band-limited)}$$

$$a(x,y) = \delta(x,y)$$

$$H(f_X, f_Y) = \mathrm{rect}(X f_X)\,\mathrm{rect}(Y f_Y)$$

$$h(x,y) = \frac{1}{XY}\,\mathrm{sinc}\!\left(\frac{x}{X}\right)\mathrm{sinc}\!\left(\frac{y}{Y}\right)$$

$$\ell \to \infty \quad \text{(analog)}$$

where the two-dimensional Dirac delta function is defined as [Goodman -
1968]:

$$\delta(x,y) = \lim_{N \to \infty} N^2 \exp\left[-N^2 \pi (x^2 + y^2)\right]$$

and the use of generalized Fourier transforms is assumed throughout. (For
a detailed discussion of delta functions, refer also to [Bracewell - 1965].)

Using the sifting property of delta functions in the con-
volution of Eq. 2.1-1 yields:

$$g_e(x,y) = g_d(x,y) * \delta(x,y) = g_d(x,y) \tag{2.1-1a}$$

The A/D conversion of Eq. 2.1-3 does not apply:

$$g_c(x,y) = g_s(x,y) \tag{2.1-3a}$$

so that combination of Eq. 2.1-1a, 2, and 3a yields:

$$g_c(x,y) = g_d(x,y)\,\mathrm{comb}\!\left(\frac{x}{X}\right)\mathrm{comb}\!\left(\frac{y}{Y}\right) \tag{2.1-6}$$
In the spatial frequency domain, the following identity
holds for Eq. 2.1-4 with the given filter:

$$\hat{G}_d(f_X, f_Y) = G_c(f_X, f_Y)\,\mathrm{rect}(X f_X)\,\mathrm{rect}(Y f_Y) = G_d(f_X, f_Y) \tag{2.1-4a}$$

assuming that $G_d(f_X, f_Y)$ is band-limited:

$$G_d(f_X, f_Y) = 0, \quad \text{for } |f_X| \ge \frac{1}{2X},\ |f_Y| \ge \frac{1}{2Y}$$

Since the expanded form of the transform in Eq. 2.1-6 is:

$$G_c(f_X, f_Y) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} G_d\!\left(f_X - \frac{n}{X},\ f_Y - \frac{m}{Y}\right) \tag{2.1-6a}$$

the band-limiting assumed for $G_d(f_X, f_Y)$ just matches the assumed rect
filter, and only the replication of $G_d(f_X, f_Y)$ at the origin passes
through:

$$\hat{G}_d(f_X, f_Y) = G_d(f_X, f_Y) \tag{2.1-5a}$$

Case B. sinc filtering for digital sampling

In the system used here (Appendix A.1), samples are recon-
structed using a two-level output printer with stair-step interpolation
from sample to sample. In effect, a two-dimensional rect function:

$$h(x,y) = \mathrm{rect}\!\left(\frac{x}{X}\right)\mathrm{rect}\!\left(\frac{y}{Y}\right)$$

is convolved with two-level samples in the space domain:

$$\ell = 2$$

$$g_c(x,y) = C\{g_s(x,y)\}_2 \tag{2.1-3b}$$

$$\hat{g}_d(x,y) = g_c(x,y) * h(x,y) \tag{2.1-4b}$$
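This is equivalent to filtering in the spatial frequency domain with a
two-dimensional sinc function:

$$H(f_X, f_Y) = XY\,\mathrm{sinc}(X f_X)\,\mathrm{sinc}(Y f_Y)$$

The scan aperture is an identical rectangle to the above:

$$a(x,y) = \mathrm{rect}\!\left(\frac{x}{X}\right)\mathrm{rect}\!\left(\frac{y}{Y}\right)$$

$$g_e(x,y) = g_d(x,y) * \mathrm{rect}\!\left(\frac{x}{X}\right)\mathrm{rect}\!\left(\frac{y}{Y}\right) \tag{2.1-1b}$$

both with dimensions matching the sampling intervals:

$$g_s(x,y) = g_e(x,y)\,\mathrm{comb}\!\left(\frac{x}{X}\right)\mathrm{comb}\!\left(\frac{y}{Y}\right) \tag{2.1-2b}$$

The resulting system description becomes:

$$\hat{g}_d(x,y) = C\left\{\left[g_d(x,y) * \mathrm{rect}\!\left(\frac{x}{X}\right)\mathrm{rect}\!\left(\frac{y}{Y}\right)\right]\left[\mathrm{comb}\!\left(\frac{x}{X}\right)\mathrm{comb}\!\left(\frac{y}{Y}\right)\right]\right\}_2 * \left[\mathrm{rect}\!\left(\frac{x}{X}\right)\mathrm{rect}\!\left(\frac{y}{Y}\right)\right] \tag{2.1-5b}$$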

In effect, the image is dissected by a rectangular X by Y grid
(Eq. 2.1-1b and 2b). The elements of the grid are then deemed black or
white, depending on the "color" predominating (Eq. 2.1-3b). Finally,
they are reproduced as solid-color rectangles of the same dimension
(Eq. 2.1-4b). [Deutsch - 1957] used this criterion for two-level
quantization.

Notice that using space domain analysis aids in under-
standing the physical process. In the spatial frequency domain, it is
especially awkward to visualize the effect of two-level quantization.

The process obviously does not achieve perfect restora-
tion. However, area-integration does serve to eliminate "fly-specks"
(document noise of less than 1/2 the sample element area). The two-level
nature of the data is also preserved. This aids in regeneration of the
document during multiple passes through such an image-processing system.
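The space-domain description above (dissect, decide black or white at 50 percent coverage, reproduce as solid rectangles) can be sketched in a few lines. This is a minimal illustration rather than the experimental system of Appendix A.1: the function names, the list-of-rows image format, and the convention that 1 means black are assumptions made here.

```python
def digitize_two_level(image, block_h, block_w, threshold=0.5):
    """Area-integrate each block_h x block_w scan element, then quantize to 0/1.

    `image` is a list of rows with values in [0, 1], where 1 means black.
    Rows/columns beyond a whole number of scan elements are ignored.
    """
    n_rows = len(image) // block_h
    n_cols = len(image[0]) // block_w
    samples = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            total = sum(
                image[r * block_h + i][c * block_w + j]
                for i in range(block_h)
                for j in range(block_w)
            )
            black_fraction = total / (block_h * block_w)
            # 50 percent decision level: coverage of 0.5 and above becomes black.
            row.append(1 if black_fraction >= threshold else 0)
        samples.append(row)
    return samples


def reproduce(samples, block_h, block_w):
    """Stair-step restoration: print each sample as a solid block_h x block_w rectangle."""
    out = []
    for row in samples:
        expanded = [v for v in row for _ in range(block_w)]
        out.extend([list(expanded) for _ in range(block_h)])
    return out
```

With 2 by 2 elements, a two-pixel-wide vertical stroke survives while an isolated single-pixel speck (one quarter of an element) is dropped, and passing the reproduced image through the same grid again returns the same samples, which is the regeneration property noted above.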


2.1.2. The probability of stroke recovery vs resolution


Linear systems techniques can be applied to simplify anal-
ysis of the generalized document. First, line drawings and characters
can be decomposed into strokes (as defined previously in Fig. 2.1-2).

The "impulse response" to an idealized stroke can be measured next for
an image-processing system, and then superposition used to extend the
results to specific cases. (This is not truly linear because amplitude
must be forced to one of two levels after superposition.)

A simplifying approximation can be made by assuming that:

$$l_s \gg d_s,\ d_p \quad \text{(mils)} \tag{2.1-7}$$

That is, in practical applications the scan elements are smaller than the
stroke length, and about the size of the stroke width.

A further simplification can be made by considering the
critical angle between the stroke and direction of scan. When

$$|\theta_s - \theta_e| = 90^\circ \tag{2.1-8}$$

the stroke appears narrowest to the sweeping element. This is important
in the case of undersampling, where:

$$w_s < d_s,\ d_p \quad \text{(mils)}$$

Strokes are not reproduced if they are too narrow to fill 50 percent of
the scan element area.

Finally, for convenience I have assumed a horizontal scan
so that:

$$\theta_e = 0^\circ \tag{2.1-9}$$

$d_h = d_s$ = scan element dimension in the scanning direction (mils)

$d_v = d_p$ = scan element dimension perpendicular to the scanning direction (mils)
Figure 2.1-3a illustrates the assumptions of Eq. 2.1-7,
8, and 9. The document reflectivity is now represented by:

$$g_d(x,y) = \mathrm{rect}\!\left(\frac{x - Z}{w_s}\right),$$

where $Z$ is a random variable with uniform distribution:

$$p(Z) = \begin{cases} \dfrac{1}{d_h}, & |Z| \le \dfrac{d_h}{2} \\ 0, & |Z| > \dfrac{d_h}{2} \end{cases}$$

representing the relative phase difference between the center line of
the stroke $(Z, y)$ and the origin (0,0). This phase difference deter-
mines whether a stroke will be sampled by the combs of Eq. 2.1-2b cen-
tered at the origin.



Fig. 2.1-3. SCANNING AN 
IDEALIZED STROKE. 
For simplicity, the problem can now be viewed in one
dimension:

$$a(x,y) = d_h(x) = \mathrm{rect}\!\left(\frac{x}{d_h}\right) \tag{2.1-10}$$

$$g_d(x,y) = w_s(x) = \mathrm{rect}\!\left(\frac{x - Z}{w_s}\right) \tag{2.1-11}$$

$$g_e(x) = w_s(x) * d_h(x) = \mathrm{rect}\!\left(\frac{x - Z}{w_s}\right) * \mathrm{rect}\!\left(\frac{x}{d_h}\right) \tag{2.1-12}$$

The convolution between $w_s(x)$ and $d_h(x)$ is illustrated
in Fig. 2.1-3b, c, and d. In Fig. 2.1-3a, a fraction of the scan element over-
laps the stroke. This value is plotted directly below, as a point of
the same height on the curve of Fig. 2.1-3d.

The A/D conversion process is illustrated in Fig. 2.1-3d
and e, with the values 0.5-and-above in Fig. 2.1-3d being assigned
value "1" in Fig. 2.1-3e. Notice that the output of the A/D converter
is a faithful version of $w_s(x)$. This is true so long as $w_s > d_h/2$.
If $w_s \le d_h/2$, the convolution will not exceed 0.5 and the stroke
will drop out.

In Fig. 2.1-3 sampling has not yet occurred. In this case,
the order of sampling and quantizing is not important. It has been shown
second in order to clearly demonstrate the effects of the random phase,
$Z$. Since $Z$ is uniformly distributed, a stroke will be sampled with
probability:

$$P_{rs}(x) = P[\text{recovery of stroke}] = \begin{cases} 0, & \text{for } w_s \le \dfrac{d_h}{2} \\ \dfrac{w_s}{d_h}, & \text{for } \dfrac{d_h}{2} < w_s \le d_h \\ 1, & \text{for } d_h < w_s \end{cases} \tag{2.1-13}$$

and Eq. 2.1-5b becomes:

$$\hat{g}_d(x) = C\left\{\left[\mathrm{rect}\!\left(\frac{x - Z}{w_s}\right) * \mathrm{rect}\!\left(\frac{x}{d_h}\right)\right]\mathrm{comb}\!\left(\frac{x}{d_h}\right)\right\}_2 * \mathrm{rect}\!\left(\frac{x}{d_h}\right)$$

The probability of recovery is zero for the case of drop-out, one for
strokes wider than the sampling interval, and linear in-between.
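The one-dimensional recovery probability can be checked numerically by sweeping the stroke phase across one sampling interval and counting how often at least one sample reads black. The sketch below is an illustrative check under the stated idealizations (perfect rectangular stroke, area-integrating element of width d_h, samples at multiples of d_h, 50 percent threshold); all function names are hypothetical, and evenly spaced phases replace random draws so the result is repeatable.

```python
import math


def overlap_fraction(element_center, d_h, stroke_center, w_s):
    """Fraction of a scan element of width d_h covered by a stroke of width w_s."""
    left = max(element_center - d_h / 2, stroke_center - w_s / 2)
    right = min(element_center + d_h / 2, stroke_center + w_s / 2)
    return max(0.0, right - left) / d_h


def stroke_recovered(stroke_center, w_s, d_h, threshold=0.5):
    """True if any sample on the grid x = k*d_h reads black (coverage >= threshold)."""
    k_lo = math.floor((stroke_center - w_s / 2) / d_h) - 1
    k_hi = math.ceil((stroke_center + w_s / 2) / d_h) + 1
    return any(
        overlap_fraction(k * d_h, d_h, stroke_center, w_s) >= threshold
        for k in range(k_lo, k_hi + 1)
    )


def recovery_probability(w_s, d_h, n_phases=2000):
    """Average recovery over evenly spaced phases Z in (-d_h/2, d_h/2)."""
    hits = 0
    for i in range(n_phases):
        z = -d_h / 2 + (i + 0.5) * d_h / n_phases
        hits += stroke_recovered(z, w_s, d_h)
    return hits / n_phases
```

For a 10-mil element, a 4-mil stroke is never recovered, a 12-mil stroke always is, and a 7.5-mil stroke is recovered for about three quarters of the phases, in line with the linear middle range.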

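Generalizing to two dimensions, a simple piece-wise linear
model is made assuming that the smaller element dimension predominates in
modeling the chance of recovery. Equation 2.1-13 now becomes:

$$P_{rs}(x,y) = \begin{cases}
0.0, & \text{for } \dfrac{w_s}{d_h} \text{ or } \dfrac{w_s}{d_v} \le 0.5 \\[2ex]
\dfrac{w_s}{d_h}, & \text{for } 0.5 < \dfrac{w_s}{d_h} \le \dfrac{w_s}{d_v} \le 1.0 \\[2ex]
\dfrac{w_s}{d_v}, & \text{for } 0.5 < \dfrac{w_s}{d_v} \le \dfrac{w_s}{d_h} \le 1.0 \\[2ex]
1.0, & \text{for } 1.0 < \dfrac{w_s}{d_h} \text{ and } \dfrac{w_s}{d_v}
\end{cases} \tag{2.1-14}$$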


In order to express the piece-wise linear ranges simultaneously for both
dimensions, the inequalities have been rearranged. This gives stroke
width over element size, which is dimensionless and constitutes a normali-
zation. In other words, if:

$1/d_h = f_h$ = horizontal spatial frequency (elem./inch)

$1/d_v = f_v$ = vertical spatial frequency (elem./inch)

then multiplying next by $w_s$ (inch/stroke) yields normalized spatial
frequencies:

$f_h^* = w_s f_h$ (elem./stroke)

$f_v^* = w_s f_v$ (elem./stroke)

Using these normalized coordinates, the piecewise-linear
model of Eq. 2.1-14 has been plotted in Fig. 2.1-4. The dotted vertical
plane, where $f_h^* = f_v^*$, represents combinations of equal element dimen-
sion (i.e., square scan elements).

Fig. 2.1-4. THEORETICAL MODEL FOR THE THRESHOLD OF LEGIBILITY.
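A small sketch may help in reading the normalized model. It assumes, as one reading of Eq. 2.1-14, that the smaller of the two normalized frequencies governs recovery; that choice also settles combinations the printed inequalities leave open (for example one ratio above 1.0 and the other between 0.5 and 1.0). The function names are hypothetical.

```python
def normalized_frequency(stroke_width_mils, element_size_mils):
    """f* = w_s / d, in scan elements per stroke (a dimensionless ratio)."""
    return stroke_width_mils / element_size_mils


def stroke_recovery_model(f_h_star, f_v_star):
    """Piece-wise linear stroke recovery, as read from Eq. 2.1-14.

    Assumption: the smaller normalized frequency (coarser direction) governs.
    """
    f_min = min(f_h_star, f_v_star)
    if f_min <= 0.5:
        return 0.0   # drop-out: stroke cannot fill half of the element
    if f_min > 1.0:
        return 1.0   # stroke wider than both element dimensions
    return f_min     # linear range between drop-out and full recovery
```

For a 10-mil stroke scanned with a 13.3 by 10.0 mil element, the coarser direction gives f* = 10/13.3, about 0.75, and that value is returned as the modeled recovery.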

Figure 2.1-4 is proposed as a theoretical model for legi- 
bility using the following superposition argument: 

i) Percent legibility of a character set is defined to 
be the average percent legibility of its individual 
characters, 

ii) The percent legibility of individual characters is 
assumed to be proportional to the percent recovery 
of their strokes. 

The assumption ii) is admittedly rough, but provides a 
first order model from which to make further refinements as insight is 
gained. 
2.2. Empirical Model Based on Legibility Measurements


Document A1 (see Appendix A.2) with four sizes of Mid-Century
type was passed through the experimental system at all 25 resolution
combinations for both vertical and horizontal scanning. The legibility
for these outputs was measured by associates at IBM, and the total ex-
periment reported internally in a joint paper [Arps, et al - 1966].
Subsequently these results were presented externally at the 1968 IEEE
Conference on Communications [Arps, et al - 1968]. Raw data from this
joint effort are reported in Appendix A.3. The subsequent contributions
in sections 2.2.1 - 2.2.3 are my own and heretofore not published.

2.2.1. Normalizing with respect to stroke width

To combine the data taken for different character sizes,
normalization was desirable. A check of the data indicated that the re-
sults of scaling were fairly linear. For example, scan aperture dimen-
sion had to double when character height was doubled to achieve the same
legibility (Fig. 2.2-1). The particular dimension to be used for normal-
ization (character height, stroke width, etc.) appeared unimportant at
first, since the different character sizes were photographic reductions
of each other. Stroke width, $w_s$, was selected in order to relate re-
sults to the theory in sections 2.1.1 and 2.1.2.
The resulting normalized resolution frequency combinations
are illustrated in Fig. 2.2-2.

Fig. 2.2-2. NORMALIZED RESOLUTION COMBINATIONS, WITH $a_{hv} = 0.0$.

They represent all the combinations of:

$f_h^* = w_s/d_h$ = normalized horizontal spatial frequency (elem./stroke)

and:

$f_v^* = w_s/d_v$ = normalized vertical spatial frequency (elem./stroke)

for all combinations of $d_h$, $d_v$, and $w_s$ with values:

$d_h$ = 5.0, 6.7, 8.0, 10.0, 13.3 mils/elem.

$d_v$ = 5.0, 6.7, 8.0, 10.0, 13.3 mils/elem.

$w_s$ = 5.0, 7.5, 10.0, 15.0 mils/stroke

The 100 combinations are grouped into 4 large overlapping
squares. These represent the same 25 resolution settings normalized by vary-
ing amounts corresponding to the 4 character sizes on Document A1. Rays
have been drawn from the origin to indicate data points with the same
$f_v^*$ to $f_h^*$ ratio. The circled data points along the 45° ray represent the
resolution combinations for square scan elements. The square-marked data
points have the largest $f_v^*$ to $f_h^*$ ratio (marked for use in later dis-
cussion along with the specification that $a_{hv} = 0$).
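The grid of normalized data points can be enumerated directly from the listed element sizes and stroke widths. The sketch below is an illustration with hypothetical names; it simply forms every (w_s, d_h, d_v) triple and the corresponding normalized frequencies.

```python
from itertools import product

ELEMENT_SIZES_MILS = [5.0, 6.7, 8.0, 10.0, 13.3]   # settings for d_h and d_v
STROKE_WIDTHS_MILS = [5.0, 7.5, 10.0, 15.0]        # w_s for the four character sizes


def normalized_combinations():
    """Return (f_h*, f_v*, d_h, d_v, w_s) for every resolution/size combination."""
    return [
        (w_s / d_h, w_s / d_v, d_h, d_v, w_s)
        for w_s, d_h, d_v in product(
            STROKE_WIDTHS_MILS, ELEMENT_SIZES_MILS, ELEMENT_SIZES_MILS
        )
    ]


def square_element_points(points):
    """Points on the 45-degree ray, i.e. those with d_h == d_v."""
    return [p for p in points if p[2] == p[3]]
```

Enumerated this way there are 25 resolution settings per character size, 20 points with square elements, and exactly one square-element point with a normalized frequency below 0.5 (the 5.0-mil stroke at 13.3 mils/elem.).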

A rough feel for the data in Table A.3-1 was obtained by
jotting values down at corresponding resolutions in Fig. 2.2-2. When
contours of equal legibility were sketched in, they appeared to have a
hyperbolic structure:

$$(f_h^* - a_v)(f_v^* - a_h) = K, \quad \text{for } P_{rc}[K] = K_L \tag{2.2-1}$$

Its asymptotes would be the lines:

$$f_h^* = a_v$$

and:

$$f_v^* = a_h$$

and the origin for the hyperbolas would lie at $(a_v, a_h)$. To preserve
symmetry I added the constraint that:

$$a_h = a_v = a_{hv} \tag{2.2-2}$$


Another plot was made using only legibility values for
square scan elements (Fig. 2.2-3). Values were taken along a 45° line
in the resolution plane for both Table A.3-1 and the theoretical model
of Fig. 2.1-4. The theoretical model became a piece-wise linear curve
appearing to bound the empirical data points. A sketch through the em-
pirical points resembled a classical $1 - e^{-x}$ curve. This resemblance was
stronger when allowance was made for the greater standard error of legi-
bilities around 50 percent.

Fig. 2.2-3 (UPPERCASE, MID-CENTURY TYPE).

The exponential structure suggested by Fig. 2.2-3 was fur-
ther evaluated by plotting it again on semi-log paper (Fig. 2.2-4). The
result was linear enough to justify a representation of the form:

$$\ln(1 - P_{rc}) = -\frac{1}{c}\,(f_{hv}^* - b), \quad \text{for } f_{hv}^* \ge b \tag{2.2-3}$$

or alternatively:

$$P_{rc} = 1 - \exp\left[-\frac{1}{c}\,(f_{hv}^* - b)\right], \quad \text{for } f_{hv}^* \ge b \tag{2.2-4}$$

where

$b$ = the intercept at zero legibility

$c$ = the exponential decay constant

This model for legibility is a physically reasonable statistic in that it
is asymptotic (in contrast to the piece-wise linear model). As resolution
frequency increases, the illegibility, $(1 - P_{rc})$, decreases in direct pro-
portion to the remaining illegibility.
Fig. 2.2-4. REGRESSION LINE FOR SQUARE ELEMENTS.

Fig. 2.2-5. REGRESSION LINE FOR RECTANGULAR ELEMENTS ALONG t = 0.5, 2.0.

This is not the same as the dimension along the 45° ray which is:

$$1.414\, f_{hv}^* = \sqrt{(f_h^*)^2 + (f_v^*)^2}, \quad \text{for } f_h^* = f_v^* = f_{hv}^* \quad \text{(elem./stroke)}$$

The next step was to combine Eq. 2.2-1, -2, and -3 into a
3-dimensional analytic expression for legibility over all resolutions.
The exponential form (Eq. 2.2-3) is like a boundary condition for the
hyperbolas (Eq. 2.2-1) along the 45° ray. Solving Eq. 2.2-3 for $f_{hv}^*$
yielded:

$$f_{hv}^* = b - c \ln(1 - P_{rc}), \quad \text{for } f_{hv}^* \ge b \tag{2.2-5}$$

Next the constraint of Eq. 2.2-2 was combined with Eq. 2.2-1 and solved
for the case where $f_h^* = f_v^* = f_{hv}^*$:

$$f_{hv}^* = \sqrt{K} + a_{hv} \tag{2.2-6}$$

Combining Eq. 2.2-5 and -6, and solving for $K$ produced:

$$K = \left[b - c \ln(1 - P_{rc}) - a_{hv}\right]^2, \quad \text{for } -c \ln(1 - P_{rc}) \ge 0 \tag{2.2-7}$$

Having solved for $K$ at the $f_h^* = f_v^*$ boundary condition, insertion of
Eq. 2.2-7 into Eq. 2.2-1 yielded the complete form:

$$(f_h^* - a_{hv})(f_v^* - a_{hv}) = \left[b - c \ln(1 - P_{rc}) - a_{hv}\right]^2, \quad \text{for } -c \ln(1 - P_{rc}) \ge 0 \tag{2.2-8}$$

And solving for $P_{rc}$ produced:

$$P_{rc} = 1 - \exp\left\{-\frac{1}{c}\left[\sqrt{(f_h^* - a_{hv})(f_v^* - a_{hv})} + a_{hv} - b\right]\right\}, \quad \text{for } \sqrt{(f_h^* - a_{hv})(f_v^* - a_{hv})} + a_{hv} \ge b \tag{2.2-9}$$

The form of Eq. 2.2-4 has been preserved, given the relationship:

$$f_{hv}^* = \sqrt{(f_h^* - a_{hv})(f_v^* - a_{hv})} + a_{hv}, \quad \text{for } f_h^* = f_v^* = f_{hv}^* \tag{2.2-10}$$
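The complete form of Eq. 2.2-9 is easy to evaluate, and doing so gives a quick check that it collapses to Eq. 2.2-4 along the 45° ray and is symmetric in the two frequencies. The sketch below is illustrative only: the function name is hypothetical, and the two clamps to zero (for points at or below the asymptotes, and below the zero-legibility intercept) are assumptions added so the function is defined everywhere.

```python
import math


def legibility(f_h_star, f_v_star, a_hv, b, c):
    """Evaluate Eq. 2.2-9; returns P_rc as a fraction in [0, 1)."""
    if f_h_star <= a_hv or f_v_star <= a_hv:
        return 0.0  # assumption: at or below an asymptote, treat as illegible
    # Equivalent square-element frequency, Eq. 2.2-10.
    f_equiv = math.sqrt((f_h_star - a_hv) * (f_v_star - a_hv)) + a_hv
    if f_equiv < b:
        return 0.0  # assumption: below the zero-legibility intercept, clamp to zero
    return 1.0 - math.exp(-(f_equiv - b) / c)
```

With a_hv = 0 and f_h* = f_v* = f, the result equals 1 - exp[-(f - b)/c], which is Eq. 2.2-4.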


Notice that the same exponential character can be preserved
away from the 45° ray. Along rays emanating from an origin, $a_{hv}$, and
with a slope defined as:

$$t = \frac{f_v^* - a_{hv}}{f_h^* - a_{hv}} \tag{2.2-11}$$

For $a_{hv} = 0$, the form for Eq. 2.2-9 becomes:

$$P_{rc} = 1 - \exp\left[-\frac{1}{c}\left(f_h^* \sqrt{t} - b\right)\right], \quad \text{for } f_h^* \sqrt{t} \ge b \tag{2.2-12a}$$

Examples of the rays being described can be seen in Fig. 2.2-2 emanating
from an origin (0,0). The term $\sqrt{t}$ represents the scaling required to
express distance along the ray in terms of the horizontal axis. A similar
expression can be obtained for $f_v^*$, reflecting the scale along such a
ray as Eq. 2.2-12 in terms of the vertical axis.

This fact permitted initial checks of the proposed model.
Data points along rays such as in Fig. 2.2-2 could be plotted on semi-
log paper to see if the results looked straight enough. Figure 2.2-5
illustrates such a plot. The sixteen data points are from 30° and 60°
rays used in combination on the assumption of symmetry (t=0.5 and 2.0).

The development of the analytic model allows a fit to the
data using only one overall semi-log plot (rather than one for each ray).
Since only the dependent variable, $P_{rc}$, is statistical, the independent
variables, $f_h^*$ and $f_v^*$, that define a data point, can be readily trans-
formed. When the $f_h^*$ and $f_v^*$ values for a data point are mapped into
an equivalent $f_{hv}^*$ value using the model:

$$f_{hv}^* = \sqrt{(f_h^* - a_{hv})(f_v^* - a_{hv})} + a_{hv} \tag{2.2-10}$$

then all data points can be fitted simultaneously on a plot of $\ln(1 - P_{rc})$
vs $f_{hv}^*$. This will be illustrated in the next section for various values
of $a_{hv}$.


2.2.2. Fitting the model using multiple regression analysis

The exponential and hyperbolic structure of the legibility
data suggested the functional relationship of Eq. 2.2-9. This function
can be considered a regression equation with multiple independent variables
$f_h^*$ and $f_v^*$, and dependent variable $P_{rc}$. The other parameters such as
$a_{hv}$, $b$, and $c$ then become the coefficients to be fitted in the
regression analysis. (Some good references for the statistical discus-
sion in this section are [Ezekiel and Fox - 1963] and [Ostle - 1963].)

The first analysis was done on the exponential relationship
of Eq. 2.2-4. Here, only the data for square elements were used. Of spec-
ial interest was the parameter $b$, which has physical meaning. Accord-
ing to the system design (Appendix A.1), total dropout should occur for
strokes less than half the width of the sampling interval, when $d_h = d_v$:

$$P_{rc} = 0, \quad \text{for } f_{hv}^* = \frac{w_s}{d_{hv}} \le 0.5$$

For the analytic expression of Eq. 2.2-4:

$$P_{rc} = 0, \quad \text{for } f_{hv}^* = b$$

The critical point is how well the physical constraint

$$b = 0.5 \tag{2.2-13}$$

matches the data taken.

For mathematical tractability a linear analysis of the
regression equation was performed, using the transformed expression
(Eq. 2.2-3) obtained by taking the natural logarithm of Eq. 2.2-4. This
fits the linear regression form

$$Y = a_0 + a_1 X$$

with

$$X = f_{hv}^*$$

$$Y = \ln(1 - P_{rc})$$

All the data points on Fig. 2.2-4 were used except the single value where
$f_{hv}^* < 0.5$ (two other data points, where $(1 - P_{rc}) = 0$, could also not be
used nor plotted).
The results of the analysis were that:

$$a_0 = 1.837 \pm 0.294$$

$$a_1 = -3.929 \pm 0.248$$

$$S_Y = 0.445$$

where the accompanying intervals are the standard error of the regression
coefficients, and $S_Y$ is the standard error of estimate (standard devia-
tion about the regression line) defined as:

$$S_Y = \sqrt{\frac{\sum_i (Y_i - \hat{Y}_i)^2}{N - 2}}$$

Transformed back into a probability by taking the antilog yields:

$$e^{S_Y} = 1.560$$

which is an expression for multiplicative error which increases with the
value of illegibility, $(1 - P_{rc})$.

To properly evaluate the regression results, residuals
between real and estimated values must be computed in the original
domain of $L = (1 - P_{rc})$:

$$S_L = \sqrt{\frac{\sum_i (L_i - \hat{L}_i)^2}{N - 2}}$$

where $\hat{L}_i$ are the estimates of $L_i$ taken from the estimate $\hat{Y}_i$ in the transformed
analysis. This additive standard error of estimate for $L$ turns out
to be:

$$S_L = 9.14$$
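The two error measures differ only in the domain where the residuals are taken. The following sketch, with hypothetical function names, computes a standard error of estimate with N - 2 degrees of freedom, the multiplicative factor obtained from its antilog, and the back-transformation of a fitted Y to an illegibility estimate; it does not use the measured data, which are reported in Appendix A.3.

```python
import math


def standard_error_of_estimate(observed, estimated, n_coefficients=2):
    """sqrt(sum of squared residuals / (N - n_coefficients))."""
    n = len(observed)
    ss = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(ss / (n - n_coefficients))


def multiplicative_error(s_y):
    """Antilog of a standard error computed in the ln(1 - P_rc) domain."""
    return math.exp(s_y)


def illegibility_from_log(y_hat):
    """Back-transform an estimate of Y = ln(1 - P_rc) to L = 1 - P_rc."""
    return math.exp(y_hat)
```

Applying `multiplicative_error` to the reported $S_Y = 0.445$ reproduces the factor of about 1.560 quoted above. To obtain $S_L$, one would back-transform each fitted $\hat{Y}_i$ with `illegibility_from_log` and pass the observed and estimated illegibilities to `standard_error_of_estimate`.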


The adjusted coefficient of determination, $\bar{r}_Y^2$, for the Y
analysis was

$$\bar{r}_Y^2 = 0.981$$


This would usually imply an estimate that 98. 1 percent of the variation 
in legibility is accounted for by the regression model. Caution must be 
observed, however, in interpreting this parameter; as well as in in- 
terpreting the standard error of the regression coefficients. Their 
meaning is ordinarily based upon random sampling of the independent vari- 
able from an underlying normal distribution. In this experiment the in- 
dependent variables are controlled, and in effect their distribution has 
been chosen in advance. Under these circumstances the meanings of the 
coefficient of determination and standard error of regression coefficients 
are restricted to the specific distribution of this experiment. Thus,
since the variance of the set of 100 resolution combinations is arbitrary,
comparisons based on this variance are only valid for the exact same set
of observations. It will turn out that even in this restrictive sense
these parameters are useful. They can be used for comparison between 
various models fitted to both the uppercase and lowercase data, since 
the identical resolution values are used in all the analyses. 

Returning to the evaluation of results from the regression
analysis, solving for b gives:

    b = -a_0 / a_1 = 0.468

This result is strongly encouraging, being close to the theoretical value
of 0.5 that was anticipated. A value slightly lower than 0.5 could be
caused by intersections of strokes in a character that might not drop
out yet at f_{hv}^* = 0.5. Also, variations in a_0 and a_1 over one
standard error would produce values of b ranging from 0.578 to 0.369,
easily including the point 0.5. The value for the second parameter, the
decay constant, c, was:

    c = -1 / a_1 = 0.255 ± 0.016
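The arithmetic behind b and c can be retraced with a short sketch. It uses only the coefficients quoted above; the helper name `intercept_and_decay` and the first-order error propagation for c are assumptions of this illustration, not part of the original analysis.

```python
# Regression coefficients reported above for uppercase Mid-Century type.
a0, a0_se = 1.837, 0.294
a1, a1_se = -3.929, 0.248


def intercept_and_decay(a0, a1):
    """Return (b, c) for the line Y = a0 + a1*X, with b = -a0/a1 and c = -1/a1."""
    return -a0 / a1, -1.0 / a1


b, c = intercept_and_decay(a0, a1)

# Extremes of b when a0 and a1 both shift by one standard error
# (an illustrative range, not a formal confidence interval).
b_high, _ = intercept_and_decay(a0 + a0_se, a1 + a1_se)
b_low, _ = intercept_and_decay(a0 - a0_se, a1 - a1_se)

# Assumption of this sketch: first-order error propagation for c = -1/a1.
c_se = a1_se / a1 ** 2

print(f"b = {b:.3f} (range {b_low:.3f} to {b_high:.3f})")
print(f"c = {c:.3f} +/- {c_se:.3f}")
```

The computed range for b agrees with the quoted 0.578 to 0.369 to within rounding in the last digit.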


Having found b close to 0.5, I decided to fix it to its
theoretical value and repeat the regression analysis without a constant
term using:

    X = f_{hv}^* - 0.5
    Y = \ln(1 - P_{rc})

The result has only a slightly larger magnitude for a_1:

    a_0 = 0.0 ± 0.110
    a_1 = -4.068 ± 0.147

and the value for c turned out to be similar:

    c = -1 / a_1 = 0.246 ± 0.009

It is this fit for P_{rc}, with b = 0.5, that is plotted as a dotted line
on Fig. 2.2-3 and -4.
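A regression without a constant term of this kind can be sketched as follows. The data are synthetic, noise-free values generated from the model itself (they are not measurements from this study), and the function name `fit_through_origin` is hypothetical.

```python
import math


def fit_through_origin(xs, ys):
    """Least-squares slope a1 for the model Y = a1 * X (no constant term)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic legibility values generated from 1 - P = exp(-(f - 0.5)/c);
# c_true and the f_hv grid are arbitrary choices for illustration.
c_true = 0.25
f_hv = [0.6, 0.8, 1.0, 1.25, 1.5]
p_rc = [1.0 - math.exp(-(f - 0.5) / c_true) for f in f_hv]

# Transform exactly as in the text: X = f_hv - 0.5, Y = ln(1 - P_rc).
X = [f - 0.5 for f in f_hv]
Y = [math.log(1.0 - p) for p in p_rc]

a1 = fit_through_origin(X, Y)
c_hat = -1.0 / a1
print(f"a1 = {a1:.3f}, recovered c = {c_hat:.3f}")
```

With noise-free input the recovered decay constant matches the value used to generate the data; with measured legibilities the same transformation gives the a_1 and c estimates reported above.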

Similar measurements were made for lowercase Mid-Century type, and for
upper- and lowercase Dual-Gothic type (Documents B1-B5). These results
are summarized in Table 2.2-1 below:

                              Table 2.2-1

                REGRESSION ANALYSES FOR SQUARE ELEMENTS

Type Font                 N    a_0              a_1               b       c       R^2     S_L
------------------------  ---  ---------------  ----------------  ------  ------  ------  ------
Uppercase, Mid-Century    17   1.837 ± 0.294    -3.929 ± 0.248    0.468   0.255   0.943   11.43
                               0.0   ± 0.110    -4.068 ± 0.147    0.5     0.246   0.981    9.11
Lowercase, Mid-Century    17   2.299 ± 0.332    -3.564 ± 0.177    0.585   0.255   0.929   14.45
                               0.0   ± 0.131    -3.564 ± 0.177    0.5     0.273   0.964   15.22
Uppercase, Dual-Gothic     5   0.0   ± 0.272    -5.504 ± 0.455    0.5     0.183   0.981
Lowercase, Dual-Gothic     5   0.0   ± 0.344    -6.228 ± 0.562    0.5     0.161   0.976   ---


Case A. a_{hv} = 0.0:

A simple regression was next performed to plot the line
in Fig. 2.2-5. The question was whether the data looked like a 1 - e^{-x}
form along the rays of Fig. 2.2-2. Having been plotted in terms of
equivalent square element dimensions, f_{hv}^*, a comparison could be
made directly with the data in Fig. 2.2-4. The encouraging results
included an intercept near 0.5. As a result, the assumption a_{hv} = 0 was
first used to fit all the data using:

    P_{rc} = 1 - \exp[ -(1/c)(\sqrt{f_h^* f_v^*} - b) ],
             for \sqrt{f_h^* f_v^*} \ge b                            (2.2-9a)

Figure 2.2-6a illustrates all the uppercase data for Mid-Century
type, with the independent variables f_h^* and f_v^* transformed
into equivalent square-element dimensions:

    f_{hv}^* = \sqrt{f_h^* f_v^*}                                    (2.2-10a)

and plotted as before, using Eq. 2.2-3. Figure 2.2-6b shows the same
data for lowercase Mid-Century type. For no constraint on a_0, the
intercepts of their regression lines were at b = 0.445 and 0.569
respectively (with b = 0.5 well within their variations as a_0 and a_1
range over one standard error). The dashed regression lines that are shown
come from the successive analyses that were constrained to pass through
b = 0.5.

The regression analyses for the data in Fig. 2.2-6a and b, 
are summarized in Table 2.2-2: 


                              Table 2.2-2

        REGRESSION ANALYSES FOR ALL ELEMENTS, WITH a_{hv} = 0.0



Comparison with Table 2.2-1 revealed that the standard error of estimate
for legibility, S_L, had remained about the same for each analysis.
Apparently, the addition of data for rectangular scan elements had not
disturbed the model.

However, the model with a_{hv} = 0 had one drawback when
the loci of equal legibility were considered. Equation 2.2-1 had become:

    f_h^* f_v^* = K^*, for P_{rc}[K^*] = K_L                          (2.2-1a)

These hyperbolas with origin (0,0) were also loci of equal scan element
area:

    f_h^* f_v^* = w_s^2 / A                                          (2.2-14)

But according to previous observations [Arps, et al - 1968],
along loci of constant legibility, square scan elements had the largest
surface area. Stated in another way, given values along Eq. 2.2-14,
legibility was maximum for square elements:

    P_{rc}[K^*] = K_L,  for f_h^* = f_v^*
                < K_L,  for f_h^* \ne f_v^*                          (2.2-15)

The inequality in Eq. 2.2-15 is also consistent with the theoretical
model proposed in section 2.2.2. When non-square elements are modeled,
their legibility is determined entirely by the smaller resolution
frequency.

These contradictions could be resolved by reverting to
the general hyperbola of Eq. 2.2-1 with the constraint:

    a_{hv} > 0                                                       (2.2-16)


Case B. a_{hv} = 0.5.

Further consideration of the theoretical model raised the
possibility that the analytic model of Eq. 2.2-9 could be matched at more
points than just b = 0.5 along the 45° ray. Using a_{hv} = 0.5, b = 0.5,
it becomes:

    P_{rc} = 1 - \exp[ -(1/c) \sqrt{(f_h^* - 0.5)(f_v^* - 0.5)} ],
             for f_h^*, f_v^* \ge 0.5                                (2.2-9b)

Along the asymptotes (f^*, 0.5) and (0.5, f^*) the legibility goes to
zero, as in the case for the theoretical model (see Fig. 2.1-4). As a
result, the origin for the hyperbolas was moved to (0.5, 0.5), resulting
in rays with the form:

    P_{rc} = 1 - \exp[ -(1/c)(f^* - 0.5) \sqrt{\cdots} ]              (2.2-12b)

[Figure: resolution-plane plot with axis ticks from 0.5 to 3.0. Only fragments of its caption and of the adjoining sentence survive: "Fig. 2.", "as shown i", "for b = 0."]

    X = \sqrt{(f_h^* - 0.5)(f_v^* - 0.5)}
    Y = \ln(1 - P_{rc})
    a_0 = 0
    a_1 = -1/c

and the analyses are summarized in Table 2.2-3:

                              Table 2.2-3

        REGRESSION ANALYSES FOR ALL ELEMENTS, WITH a_{hv} = 0.5

Type Font                 N    a_0              a_1               b       c       R^2     S_L
------------------------  ---  ---------------  ----------------  ------  ------  ------  ------
Uppercase, Mid-Century    89   0.0 ± 0.058      -3.747 ± 0.068    0.5     0.267   0.973    8.74
Lowercase, Mid-Century    89   0.0 ± 0.056      -3.206 ± 0.065    0.5     0.312   0.965   12.37



The standard errors of estimate for legibility, S_L, showed significant
improvement over previous results. This confirmed the desirability of
modeling the hyperbolic origin at (0.5, 0.5). The coefficient of
determination, R^2, also improved steadily with each step; and the decay
constants, c, are closer to the values obtained for square elements alone.

One further refinement was attempted. A second order 
effect was found for data points progressively farther from the 45° 
ray for both case A and B. To illustrate this, the farthest points 
were marked with squares in the resolution planes of Fig. 2.2-2 and 2.2-7. 
Their corresponding legibilities were also marked this way when plotted 
in Fig. 2.2-6 and 2.2-8. These data points line up with a negative slope 
that is less than for the other points on the figures. This could be
accommodated in the model by changing the square root in Eq. 2.2-9 to an
arbitrary exponent, g.

For Case B, Eq. 2.2-9b was transformed into the form:

    \ln[ -c \ln(1 - P_{rc}) ] = g \ln[ (f_h^* - 0.5)(f_v^* - 0.5) ]    (2.2-17)

and regression coefficients calculated, using:

    X = \ln[ (f_h^* - 0.5)(f_v^* - 0.5) ]
    Y = \ln[ -0.267 \ln(1 - P_{rc}) ]
    a_0 = 0
    a_1 = g

having used the value for c for uppercase from Table 2.2-3. As a result,
the value for the exponent became:

    g = 0.524 ± 0.016

confirming that a departure from g = 0.5 might improve the fit to the
data.

[Figure: two panels (one titled "Uppercase") plotting PROBABILITY OF ILLEGIBILITY, (1 - P_{rc}), against \sqrt{(f_h^* - 0.5)(f_v^* - 0.5)} (elements/stroke).]

Fig. 2.2-8. REGRESSION LINE FOR ALL ELEMENTS, WITH a_{hv} = 0.5.

A similar analysis was performed using just the square
element data to evaluate the choice of 1 - e^{-x} with power g = 1:

    X = \ln(f_{hv}^* - 0.5)
    Y = \ln[ -0.246 \ln(1 - P_{rc}) ]
    a_0 = 0

having used the value for c for uppercase with b = 0.5, from Table
2.2-2. This time the exponent came out quite close to its proper value:

    g = 1.014 ± 0.059

with g = 1.0 well within one standard error.

In summary, the analytic model proposed for legibility is:

    P_{rc} = 1 - \exp[ -(1/c) \sqrt{(f_h^* - 0.5)(f_v^* - 0.5)} ]        (2.2-18)


35 


SEL-68-102 



with c = 0.267 and 0.312 for upper- and lowercase Mid-Century type
respectively. The corresponding standard errors of estimate measured
for the model were 8.74 percent and 12.37 percent. For just square
elements, this model degenerates to:

    P_{rc} = 1 - \exp[ -(1/c)(f_{hv}^* - 0.5) ]

with the same values of c for Mid-Century type and c = 0.182 and 0.161
having been measured respectively for upper- and lowercase Dual-Gothic type.
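As a rough illustration, the Case B model can be evaluated numerically. The sketch below assumes the form P_rc = 1 - exp[-(1/c) sqrt((f_h - 0.5)(f_v - 0.5))] with hyperbolic origin at (0.5, 0.5); the function name `legibility` and the choice to return zero at or below 0.5 elements/stroke are assumptions of this example.

```python
import math


def legibility(f_h, f_v, c):
    """Legibility P_rc for normalized resolutions f_h, f_v (elements/stroke).

    Assumed form: 1 - exp(-sqrt((f_h - 0.5) * (f_v - 0.5)) / c).
    Returning 0 at or below 0.5 in either direction is a choice of this sketch.
    """
    if f_h <= 0.5 or f_v <= 0.5:
        return 0.0
    return 1.0 - math.exp(-math.sqrt((f_h - 0.5) * (f_v - 0.5)) / c)


c_upper = 0.267  # decay constant quoted for uppercase Mid-Century type

# Square element and a rectangular element with the same product f_h * f_v.
square = legibility(math.sqrt(2.5), math.sqrt(2.5), c_upper)
rectangular = legibility(2.5, 1.0, c_upper)
print(f"square: {square:.3f}, rectangular: {rectangular:.3f}")
```

For equal element area the square element comes out more legible than the rectangular one, which is the behavior the shifted origin was introduced to capture.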





Chapter III 


THE ENTROPY OF PRINTED MATTER WITH SPATIAL FREQUENCY 

With a printed document digitized as discussed previously, attention 
can be focused next on the number of binary digits used to represent it. 
The dissection itself produces binary "elements" (with dimensions speci- 
fied by the resolution grid). Information theory indicates the potential 
for encoding such messages into representations that, on the average, have 
fewer binary digits than uncoded representations [Shannon - 1948]. 

This compression of average message "length" stems from a-priori 
knowledge about the messages to be processed. Here, the emphasis on 
printed matter specifies messages (images) with a distinctive statisti- 
cal structure suited to compression coding. The next sections deal with 
this statistical structure and the scan patterns to extract it. An em- 
pirical model is presented expressing the entropy of these scan patterns 
as a function of resolution, character size and density, and page area. 
Finally, a measure for efficient compression is proposed. 

3.1. Source Alphabets for Line-by-Line Scanning 

The problem of synthesizing a compression code is classically
defined in terms of a "source," its "alphabets," and a-priori
distributions over these alphabets. Minimum average length codes can be
designed to match any of the given distributions [Huffman - 1952]. In this
problem, the source is the output from a document scanner. Elements in the
output stream may be grouped in a variety of ways to form "patterns" that
map into symbols of a source alphabet. The statistics of each such source
alphabet, and the ultimate compression it achieves, depend on the match
between its scan pattern and typical input images.

The following subsections explore various scan patterns, their
resulting statistics, and the compression achieved for document group C1
(described in Appendix A.2). Intuitively, the objective is to find scan
patterns that define natural source alphabets: decompositions that capture
the information present in a typical image with a minimum of accompanying
redundancy.

3.1.1. Element alphabets


Consider the source alphabet, X: 

    X = {x_0, x_1}

The decomposition of an image for X is a mapping of single black or
white output elements into the symbols x_0 or x_1 respectively. This
simple mapping uses a scan pattern extending over only one element for 
decomposing the input image. Figure 3.1-la illustrates decomposition of 
four elements in the horizontal scan of an idealized letter "H. " The 
scan is shown producing symbols from alphabet X in the order x^, x , 

x l’ x l* 

Another pattern, defined as covering groups of n successive
elements, can decompose the image into an alphabet of 2^n symbols.
This alphabet, assuming successive scan elements are independent, is the
n-th extension of X, written X^n:

    X^n = {x_0^n, x_1^n, x_2^n, ..., x_m^n}, where m = 2^n - 1

Figure 3.1-1b illustrates decomposition of the four elements into only
two symbols, using the 2nd extension of X as the alphabet. Note that
for line-by-line scanning "successive" elements are physically adjacent.
In other schemes such as Pseudo-Random Scanning, successive elements would
be distributed about the image almost at random [Huang - 1964].

The encoding process consists of mapping symbols from the
source alphabet one-to-one into equivalent binary "words." This is done
to represent document images with (on the average) fewer binary digits.
The average length of these words, L_n, can be bounded for any n-th
extension using Shannon's First Theorem [Abramson - 1963]:

    H[X] \le L_n / n < H[X] + 1/n bits/symbol                        (3.1-1)

Here, H[X] is the entropy of the source, defined over the probability
distribution of the alphabet symbols:

    H[X] = - \sum_i P[x_i] \log_2 P[x_i] bits/symbol                 (3.1-2)

Fig. 3.1-1. EXAMPLES OF 1-D SCAN PATTERNS
(HORIZONTAL LINE-BY-LINE SCANNING.)


As an example, let us use the distribution of X for the
Pica type document C1. Of the 730,281 elements sampled, 9.5 percent were
black. Expressed as estimates with accompanying standard error, the
symbol probabilities are:

    P[x_0] = 0.095 ± 0.0003
    P[x_1] = 0.905 ± 0.0003

Assuming this to be the actual distribution, the entropy is calculated
and Eqs. 3.1-1 and 3.1-2 yield:

    H[X] = 0.453 bits/symbol

    0.453 \le L_n / n < 0.453 + 1/n bits/symbol                      (3.1-3)

The 1st extension, for example, would require words with average length:

    0.453 \le L_1 < 1.453 bits

The 2nd extension would require longer words with double the entropy:

    0.906 \le L_2 < 1.906 bits

The 2nd extension, however, has the source symbols grouped in twos. The
average length per (original) symbol is more meaningful for comparison:

    0.453 \le L_2 / 2 < 0.953 bits/symbol

Notice that the upper bound has tightened.
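The entropy and the word-length bounds above can be reproduced with a few lines. This is a minimal sketch; the function names `entropy` and `per_symbol_bounds` are chosen here for illustration.

```python
import math


def entropy(probs):
    """Entropy in bits/symbol of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def per_symbol_bounds(h, n):
    """Bounds on L_n / n from Shannon's First Theorem: H <= L_n/n < H + 1/n."""
    return h, h + 1.0 / n


# Element probabilities measured for document C1 (9.5 percent black).
h_x = entropy([0.095, 0.905])

for n in (1, 2, 4):
    low, high = per_symbol_bounds(h_x, n)
    print(f"n = {n}: {low:.3f} <= L_n/n < {high:.3f} bits/symbol")
```

Increasing n leaves the lower bound fixed at H[X] and pulls the upper bound down toward it, which is the tightening noted in the text.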

Dependency can also be assumed within the scan patterns of X^2, and
distributions estimated directly. This is in contrast to using X^2,
assuming independence, and deriving each P[x_i^2] as the product of two
P[x_i]. Defining these groups of two as symbols from a joint source
alphabet, [X,Y], with

    X = {x_0 = w, x_1 = b}, at time t
    Y = {y_0 = w, y_1 = b}, at time t + D

a new joint entropy with four terms may be estimated:





    H[X,Y] = - \sum_i \sum_j P[x_i, y_j] \log_2 P[x_i, y_j] bits/symbol    (3.1-4)

The resultant entropy calculation yields:

    H[X,Y] = 0.741 bits/symbol

    0.741 \le L_1 < 1.741 bits/symbol

The dependency present in groups of two can also be utilized
by computing the conditional distributions for an element, given
the preceding scan element. The source is then modeled as a 1st-order
Markov chain (Fig. 3.1-2) with possible states b and w for the
successive trials X and Y. The conditional entropy for each possible
x_i is calculated using:

    H[Y|X=x_i] = - \sum_j P[y_j|x_i] \log_2 P[y_j|x_i] bits/symbol    (3.1-5)

[Figure: two-state diagram with states b and w, showing the self-transitions P[b|b] and P[w|w].]

Fig. 3.1-2. MARKOV MODEL FOR CONDITIONAL SOURCE.

This model is treated as two cases, depending on the prior state:

    H[Y|X=x_0] = 0.930 bits/symbol
    H[Y|X=x_1] = 0.232 bits/symbol

Similarly, Shannon's First Theorem is applied for each case:

    0.930 \le L_1 < 1.930, given X = x_0
    0.232 \le L_1 < 1.232, given X = x_1

The conditional entropy, given the entire set X, is a
weighted average:

    H[Y|X] = P[x_0] H[Y|X=x_0] + P[x_1] H[Y|X=x_1] bits/symbol       (3.1-6)

with value:

    H[Y|X] = 0.288 bits/symbol

The average word length becomes:

    0.288 \le L_1 < 1.288

Although the joint and conditional entropies both take 
dependency between two elements into consideration, they utilize it in 
different ways. An alphabet based on the joint entropy sends scanner 
output for groups of two elements; entropy must be halved to compute the 
bits per element. The conditional approach only sends output for one 
element, using one of two alphabets depending on the prior state. The 
two can be reconciled using the identity: 


H [X, Y] = H[X] + H[Y|X] 


which can be verified using the entropies in the previous examples. The 
conditional entropy is just the additional information required to com- 
plete the task of sending both scan elements. 
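The identity can be checked numerically on any joint distribution over two successive elements. In the sketch below the joint probabilities are hypothetical (chosen only so that the marginal blackness is 9.5 percent); they are not the measured values for document C1.

```python
import math


def entropy(probs):
    """Entropy in bits of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical joint probabilities P[x, y] for two successive elements.
joint = {
    ("b", "b"): 0.060,
    ("b", "w"): 0.035,
    ("w", "b"): 0.035,
    ("w", "w"): 0.870,
}

# Marginal distribution of the first element.
p_x = {}
for (x, _), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

h_joint = entropy(joint.values())
h_x = entropy(p_x.values())

# Conditional entropy as the weighted average of per-state entropies (Eq. 3.1-6).
h_y_given_x = sum(
    p_x[x] * entropy([joint[(x, y)] / p_x[x] for y in ("b", "w")])
    for x in p_x
)

print(f"H[X,Y] = {h_joint:.3f}, H[X] + H[Y|X] = {h_x + h_y_given_x:.3f}")
```

The two printed quantities agree for any valid joint distribution, just as 0.453 + 0.288 = 0.741 in the measured example.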

A good measure for comparison of source alphabets is the
compression, C_n, defined as:

    C_n = p n / L_n elements/bit                                     (3.1-7)

This is simply the reciprocal of previous expressions, adjusted by the
number, p, of image elements in a source symbol or pattern. Combining
with Eq. 3.1-1 gives a bounded relationship for the compression:

    p / (H[X] + 1/n) < C_n \le p / H[X] elements/bit

Compression values for the previous examples are as follows:

    For [X]:     p = 1 elements/symbol,   0.69  < C_1 \le 2.21 elements/bit
    For [X^2]:   p = 2 elements/symbol,   1.05  < C_1 \le 2.21 elements/bit
    For [X,Y]:   p = 2 elements/symbol,   1.15  < C_1 \le 2.70 elements/bit
    For [Y|X]:   p = 1 elements/symbol,   0.776 < C_1 \le 3.46 elements/bit
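These bounds follow directly from the entropies quoted earlier. The sketch below recomputes them; the function name `compression_bounds` is an assumption of this example, and the inputs are the rounded entropies from the text, so the last digit can differ slightly from the listed values.

```python
def compression_bounds(h, p, n=1):
    """Bounds on compression in elements/bit: p/(H + 1/n) < C_n <= p/H.

    h is the entropy in bits per source symbol, p the number of image
    elements per source symbol, and n the order of the code extension.
    """
    return p / (h + 1.0 / n), p / h


# (entropy in bits/symbol, elements per symbol) for each alphabet in the text.
alphabets = {
    "[X]": (0.453, 1),
    "[X^2]": (0.906, 2),
    "[X,Y]": (0.741, 2),
    "[Y|X]": (0.288, 1),
}

for name, (h, p) in alphabets.items():
    low, high = compression_bounds(h, p)
    print(f"{name:6s} {low:.3f} < C_1 <= {high:.3f} elements/bit")
```

Only the upper bound depends on the alphabet alone; the lower bound also reflects the one-bit penalty of a code with n = 1.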


My purpose in elaborating in such detail with examples
is to convey some of the "feel" for these definitions, which are used
throughout the next sections. Attention will be focused on estimates of the
upper bound for compression, and its dependency on the scan pattern
chosen. Notice that the use of n-th extensions serves merely to tighten
the lower bound. This expresses the improvement actually obtained
by using n-th extension codes to approach the upper limit for C. These
theoretical coding techniques are well developed. However, the matching
of source alphabets to significant populations in the real world is the
challenge. Insight into the real population is required to find "natural"
source alphabets that match the inherent structure and maximize the upper
bound for compression.

3.1.2. Run-length alphabets

One natural source alphabet for line-by-line scanning uses
variable scan patterns that cover the length of a "run" [Laemmel - 1951].
A run is an adjacent series of output elements all of the same "color,"
black or white.

The run-length source alphabet [W+B] may be defined in
terms of two subalphabets, one for white and the other for black runs:

    W = {w_1, w_2, w_3, ...}
    B = {b_1, b_2, b_3, ...}

where the total set has been labeled [W+B], using "+" to denote the
union between two subalphabets. The length of run is denoted by the
subscript. Figure 3.1-1c illustrates a scan producing set members from W
and B, and emphasizes how these scan patterns vary in length to match runs.

The estimated distributions for the two subalphabets are
shown in Fig. 3.1-3. Although these are discrete densities, the
estimated values have been joined by straight lines to emphasize their
shape. The horizontal scale has been stretched out to indicate any runs
across the full page width. Such runs, indicating all-white lines, can be
seen in the distribution for P[W]. They have been labeled, along with other
recognizable characteristics of the printed page, using the following
notation:

    w_s  = stroke width
    w_ic = inter-character width
    w_iw = inter-word width
    w_es = end-of-sentence width (peculiar to the test documents C1 and C2)
    w_m  = margin width
    w_p  = page width

If runs are independent, [Capon - 1959] showed that they can be modeled
as if generated by the Markov source for [Y|X] in Fig. 3.1-2. The
distribution of runs is just the geometric distribution for first-passage
times.

Using these distributions, the entropy for each subalphabet
of run-lengths was estimated:

    H[B] = 2.54 bits/symbol
    H[W] = 4.67 bits/symbol

The weighted averages of scan pattern lengths, p_ave, are needed for
computing compression:

    p_ave[B] = \sum_i i P[b_i] = 2.88 elements/symbol
    p_ave[W] = \sum_i i P[w_i] = 26.62 elements/symbol

The bounds on compression estimated for each subalphabet using Eq. 3.1-6
are:

    0.81 < C_1[B] \le 1.13 elements/bit
    4.70 < C_1[W] \le 5.71 elements/bit

For the total alphabet, these can be combined as a weighted
average of the bits each subalphabet generates per input element (hence,
reciprocals are required):

    1 / C_n[W+B] = P[x_1] / C_n[W] + P[x_0] / C_n[B] bits/element     (3.1-8)

where the weights are just the probability of black or white elements as
measured previously for the alphabet, X. The combined compression is:

    3.24 < C_1[W+B] \le 4.14 elements/bit

expressed here for suitable comparison with the preceding results.

Compressions for the various one-dimensional scan patterns
are summarized in Table 3.1-1. Accompanying the data for Pica type
document C1 is data for an identical document that used Dual-Gothic type.

                              Table 3.1-1

                   COMPRESSION FOR SOURCE ALPHABETS
                          (ONE-DIMENSIONAL)

Source Alphabet            Document C1 (Pica)    Document C2 (Dual-Gothic)
-----------------------    ------------------    -------------------------
[X]                        2.21 elements/bit     2.51 elements/bit
[X,Y]                      2.70 elements/bit     3.00 elements/bit
[Y|X]                      3.46 elements/bit     3.76 elements/bit
[W+B]                      4.14 elements/bit     4.73 elements/bit
Blackness                  9.5%                  8.0%
Resolution (absolute)      8 x 8 mil             8 x 8 mil
Resolution (normalized)    1.62 x 1.62           1.25 x 1.25
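The weighted-reciprocal combination of Eq. 3.1-8, which produces the [W+B] entry for document C1, can be sketched as follows. The function name `combined_compression` is hypothetical, and the inputs are the rounded subalphabet bounds quoted in the text, so the result reproduces the published 3.24 and 4.14 only to within about 0.02.

```python
def combined_compression(p_black, c_black, c_white):
    """Combine black-run and white-run compressions (elements/bit).

    Each subalphabet contributes bits per input element in proportion to
    the fraction of elements of that color, so reciprocals are averaged.
    """
    bits_per_element = p_black / c_black + (1.0 - p_black) / c_white
    return 1.0 / bits_per_element


# Subalphabet bounds quoted for document C1 (9.5 percent black elements).
lower = combined_compression(0.095, c_black=0.81, c_white=4.70)
upper = combined_compression(0.095, c_black=1.13, c_white=5.71)
print(f"{lower:.2f} < C_1[W+B] <= {upper:.2f} elements/bit")
```

Although black elements are under 10 percent of the page, their low compression contributes roughly a third of the bits in the upper-bound case.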





If a binary source can be modeled as a two-state, 1st-order
Markov chain, the first-passage times for the states represent runs
[Capon - 1959]. The first-passage times, and hence the runs for the two
states, are simply geometrically distributed. Thus, only one parameter
characterizes each distribution, and these turn out to be relatively easy
to measure. (Distributions calculated by Capon's model are superimposed on
the actual run distributions in Fig. 3.1-3 for comparison.) The
compression for the Markov source, H[Y|X], was shown by Capon to equal the
compression obtained by instead encoding its run-lengths (if compared over
a large sample). He also proved that for this model, the runs are
independently distributed.

As may be seen in Table 3.1-1, C_1[W+B] is greater than C_1[Y|X] for both
type fonts. Capon also found the same to be true in comparison with
measurements by [Deutsch - 1956]. However, having used a sample over 100
times as large (730,281 vs 5,075), these variations are unlikely to be due
to sample size. The lower entropy per element estimated from actual
run-length frequencies implies greater than 1st-order dependence between
scan elements. In the next section, the assumption of
independence between runs will also be examined.
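Capon's equivalence can be illustrated numerically for an idealized two-state chain. In the sketch below the self-transition probabilities are hypothetical, not values measured from the test documents, and both function names are chosen for this example. Runs of a state with self-transition probability q are geometric, with mean length 1/(1 - q) and entropy h2(q)/(1 - q).

```python
import math


def h2(q):
    """Binary entropy function in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def markov_bits_per_element(stay_b, stay_w):
    """Conditional entropy H[Y|X] of the two-state chain (0 < stay < 1)."""
    leave_b, leave_w = 1.0 - stay_b, 1.0 - stay_w
    pi_b = leave_w / (leave_b + leave_w)  # stationary probability of black
    return pi_b * h2(stay_b) + (1.0 - pi_b) * h2(stay_w)


def run_length_bits_per_element(stay_b, stay_w):
    """Bits per element when the geometric run lengths are encoded instead."""
    bits = h2(stay_b) / (1.0 - stay_b) + h2(stay_w) / (1.0 - stay_w)
    elements = 1.0 / (1.0 - stay_b) + 1.0 / (1.0 - stay_w)
    return bits / elements


# Hypothetical self-transition probabilities P[b|b] and P[w|w].
stay_b, stay_w = 0.65, 0.96
markov = markov_bits_per_element(stay_b, stay_w)
runs = run_length_bits_per_element(stay_b, stay_w)
print(f"Markov: {markov:.4f}, run-length: {runs:.4f} bits/element")
```

The two figures coincide for a true 1st-order source, so a measured run-length entropy below the Markov figure points to structure that the 1st-order model does not capture.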

3.2. Extracting the Dependency in Images of Printed Matter 

The compression attainable by encoding the dependency of entire
runs prompted further thought about the characteristics of dissected
print. After scanning through a few strokes, the width of subsequent
strokes for that document should be easy to guess. The regularity in
character structure allows further predictions to be made. Following a
stroke might be an intra-character space, such as that between the
vertical sides of an H, D, or O. Other possibilities would be the space
between letters or words.

What is important is that organizing the image into runs gets
at these salient features. A run increases its length until the stroke
or space that it consists of has been completed. In a sense, the run
increases until that stroke or space has been "measured." This is evident
in Fig. 3.1-3, where features from the character structure were readily
identified. By contrast, blocking consecutive scan elements into
fixed-length words attacks the dependency, but in a somewhat random manner.
Criticism of the approach used to match the alphabet to source
characteristics is not based on just aesthetic reasons. Organizing the scan
data in a meaningful way is usually rewarded with greater compression as
well as insight.

Having captured the structure of print in using runs, and having 
recognized that prediction of structure is possible, the next step is to 
investigate whether runs are predictable. Another way of looking at this 
question is to ask whether dependency exists between runs. 





3.2.1. A Markov Model for the dependence between runs 


To characterize possible dependence between runs, a finite
Markov model was used as shown in Fig. 3.2-1.

Fig. 3.2-1. MARKOV MODEL OF DEPENDENCY BETWEEN RUNS

Here I have assumed the 1st-order Markov property, that dependence is
limited to the immediately preceding state:

    P[X_i = x_i | X_{i-1} = x_{i-1}, ..., X_0 = x_0] = P[X_i = x_i | X_{i-1} = x_{i-1}]      (3.2-1)


where X_i is a random variable representing the state of the model at
time i. (A good reference on Markov chains, including their application
to information sources, can be found in [Ash - 1965]. For a statistical
reference on estimation of transition probabilities, see [Mood and
Graybill - 1963].) The Markov chain is shown with a state for each
possible run of black or white. Its transition matrix has the form:

    P[W+B|W+B] =
    \begin{bmatrix}
      0 & 0 & \cdots & 0 & p_{w_1 b_1} & p_{w_1 b_2} & \cdots & p_{w_1 b_n} \\
      0 & 0 & \cdots & 0 & p_{w_2 b_1} & p_{w_2 b_2} & \cdots & p_{w_2 b_n} \\
      \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
      0 & 0 & \cdots & 0 & p_{w_n b_1} & p_{w_n b_2} & \cdots & p_{w_n b_n} \\
      p_{b_1 w_1} & p_{b_1 w_2} & \cdots & p_{b_1 w_n} & 0 & 0 & \cdots & 0 \\
      p_{b_2 w_1} & p_{b_2 w_2} & \cdots & p_{b_2 w_n} & 0 & 0 & \cdots & 0 \\
      \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
      p_{b_n w_1} & p_{b_n w_2} & \cdots & p_{b_n w_n} & 0 & 0 & \cdots & 0
    \end{bmatrix}
    =
    \begin{bmatrix}
      0 & P[B|W] \\
      P[W|B] & 0
    \end{bmatrix}                                                    (3.2-1)


where in the 1st and 3rd quadrants:

    P[B|W] = {p_{w_i b_j} = P[b_j|w_i] \ge 0}
    P[W|B] = {p_{b_i w_j} = P[w_j|b_i] \ge 0}

and in the 2nd and 4th quadrants:

    P[W|W] = {p_{w_i w_j} = P[w_j|w_i] \equiv 0} = 0
    P[B|B] = {p_{b_i b_j} = P[b_j|b_i] \equiv 0} = 0

Subscripts i and j correspond to consecutive increments in time.

The values in the 2nd and 4th quadrants are zero, since
runs alternate in color. This is illustrated in Fig. 3.2-1 by the absence
of self-loops or transitions between runs of the same color. A
consequence of this is that the stochastic matrix:

    P[W+B|W+B]

can be broken into two stochastic submatrices:

    P[B|W] = {p_{w_i b_j}}
    P[W|B] = {p_{b_i w_j}}                                           (3.2-2a)

such that:

    \sum_j p_{w_i b_j} = 1
    \sum_j p_{b_i w_j} = 1                                           (3.2-2b)

for the 1st and 3rd quadrants respectively, and i, j = 1, 2, ..., n.


In order to estimate the terms for one of the Markov matrices,
frequencies of occurrence, N_{ij}, have to be recorded for the various
transitions. The conditional probabilities are then estimated using:

    \hat{p}_{ij} = N_{ij} / N_{i.}                                   (3.2-3)

where:

    N_{i.} = \sum_j N_{ij}

This statistic is an unbiased, consistent, maximum-likelihood estimator
(M.L.E.), and for large samples is approximately distributed with a
multivariate normal distribution. The sample variance for the estimator is
distributed as:

    \sigma^2[\hat{p}_{ij}] = \hat{p}_{ij} (1 - \hat{p}_{ij}) / N_{i.}    (3.2-4)

The large values for N_{i.} required to estimate each row
in the stochastic matrices cause the total number of samples, N, to
be very large:

    N = \sum_i N_{i.}, for i = 1, 2, ..., n

The maximum value, n, for i and j is determined by the page width
and horizontal element dimension:

    n = w_p / d_h

For documents C1 and C2 using 8 mil elements:

    n_{C1} = n_{C2} = 8.5 / 0.008 = 1062

For document C2 using 5 mil elements:

    n_{C2} = 8.5 / 0.005 = 1700

The array sizes are n^2 and thus require on the order of 10^6 computer
words for storage.
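The row-by-row estimation of transition probabilities described above can be sketched as follows. The counts are hypothetical, the function name `estimate_transitions` is chosen for this example, and the variance line uses the usual large-sample variance of a proportion, which is an assumption of the sketch.

```python
def estimate_transitions(counts):
    """Estimate p_ij = N_ij / N_i. row by row from a table of transition counts.

    Returns (p_hat, var_hat); var_hat uses p(1 - p) / N_i. (an assumption of
    this sketch). Rows with no observations are returned as None.
    """
    p_hat, var_hat = [], []
    for row in counts:
        n_i = sum(row)
        if n_i == 0:
            p_hat.append(None)
            var_hat.append(None)
            continue
        p_row = [n_ij / n_i for n_ij in row]
        p_hat.append(p_row)
        var_hat.append([p * (1.0 - p) / n_i for p in p_row])
    return p_hat, var_hat


# Hypothetical counts N_ij: rows index the preceding white run length,
# columns index the following black run length.
counts = [
    [12, 30, 8],
    [10, 28, 12],
    [0, 0, 0],
]
p_hat, var_hat = estimate_transitions(counts)
for i, row in enumerate(p_hat, start=1):
    print(f"row {i}: {row}")
```

Because each row needs its own N_i. to be large before its estimates settle, rows for rare run lengths dominate the sample-size requirement.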

This strain on computational facilities was resolved by
reducing the matrix sizes down to some value, K. States i = 0 and
j = 0 were "runs" of zero length, representing the left and right margin,
respectively. Runs of (K-1) or larger were grouped together, while
runs of (K-2) or less were left as before. Inspection of independent
distributions like Fig. 3.1-3 revealed that black runs do not exceed
about 10 and 16 for documents C1 and C2 (5 mils) respectively. Therefore
it was arbitrarily decided to set K = 20. (For simplicity, data for
the boundary conditions i, j = 0 and i, j = (K-1) have been deleted in the
following discussions.)

The size of N was arrived at empirically, by increasing the sample
size until the row distributions stabilized. This could be readily seen
using computer plots such as Fig. 3.2-2. The size of N ultimately became
the data from half of a standard 8 1/2 x 11 inch page (the exact regions
for each test document are indicated in Appendix A.2):

    N = h_p w_p / (d_h d_v) elements

where:

    h_p = height scanned on a page

For the documents C1 and C2 (5 mils), these sample sizes were:

    N_{C1} = 730,281 elements
    N_{C2} = 1,870,000 elements

The successive curves in Fig. 3.2-2 are estimated distributions
for the 1st quadrant rows of Eq. 3.2-1.



Fig. 3.2-2. CONDITIONAL PROBABILITY DISTRIBUTIONS
FOR THE 1st QUADRANT MARKOV SUBMATRIX, P[B|W].

These discrete distributions were again represented as curves, by joining
their values with straight lines. They each represent a distribution:

    P[B|w_i], for i = 1, 2, ..., 18

that must sum to one (including terms beyond the figure, out to j = n):

    \sum_j P[b_j|w_i] = 1, for i = 1, 2, ..., n

and are dependent on some prior white run of length 1 to 18.

At first glance, the results were disappointing. There
was almost no dependence to be seen, as evidenced by the similarity
between planes. In other words, the distributions for black runs appeared
almost independent of the preceding white run. These distributions for
P[B|W] just resembled the distribution for P[B] in Fig. 3.1-3 (they
both were derived from the same sample of document C1). The obvious
salient feature was the stroke width, w_s (expressed in terms of
consecutive black elements, B, rather than its dimension in mils).

On the other hand, the distribution displayed in Fig. 3.2-3 showed
marked dependency.

Fig. 3.2-3. CONDITIONAL PROBABILITY DISTRIBUTION
FOR THE 3rd QUADRANT MARKOV SUBMATRIX, P[W|B].

Here the rows are:

    P[W|b_i], for i = 1, 2, ..., 18

The inter-character width, w_ic, and inter-word width, w_iw, from
Fig. 3.1-3 are easily recognized. Unlike the topography for w_s in the
preceding figure, these regions are not parallel to the B axis. Instead
they tend toward dependency with B of the form:

    B + W = constant                                                 (3.2-5)

Such variation is due to spatial quantization of the original characters.
Whereas the width of a character may be standard, it will be digitized
with varying combinations of B and W, depending on relative phase
with the scan element. This tradeoff between B and W for an adjacent
stroke and space is such that their sum should stay constant.

A feature heretofore lumped together with w_ic (in Fig. 3.1-3) is
labeled w_is, and defined as:

    w_is = inter-stroke width

Its characteristic was not apparent in the marginal distribution of whole
runs, P[W], because w_is and w_ic overlap. One can visualize this
overlap as viewing the topology in Fig. 3.2-3 while standing on the W
axis. Notice that w_is should be an especially appropriate example of
Eq. 3.2-5, being that it is postulated in terms of stroke and
inter-stroke widths. This is borne out by the topology for w_is, which
comes close to following a line with -45° slope in the B-W plane.

3.2.2. \chi^2 test for dependence between runs

Encouraged by the apparent dependency in Fig. 3.2-3, my
next step was to test this hypothesis statistically. The classical
methods using likelihood ratios to test independence in contingency tables
[Neyman and Pearson - 1933], [Wilks - 1935], have more recently been
applied to the testing of finite Markov chains [Bartlett - 1951]. Among
the problems considered are the estimation of transition probabilities,
the testing of goodness of fit, and testing of the order of the chain.
A good survey of statistical methods for Markov chains, along with an
extensive bibliography, can be found in [Billingsley - 1961].

In this application, my prime interest is to test the
hypothesis that the chain is of a given order. It is recognized that
results for the finite, discrete chain are only first approximations to
such hypotheses for the original stochastic process. One long observation
of the chain is made, under the assumption that the basic process is
stationary and ergodic.

The likelihood ratio used to test the hypothesis that runs 
are independent (that the chain is of zero order) is: 



(3.2-6)


For large samples, this is distributed as a chi square:

$$\chi^2 = \sum_{i,j} \frac{\left[N_{ij} - (N_{i\cdot} N_{\cdot j}/N)\right]^2}{N_{i\cdot} N_{\cdot j}/N}, \quad \text{for } i, j = 1, 2, \ldots, k \tag{3.2-7}$$

with $(k - 1)^2$ degrees of freedom. It differs from $-2 \log \lambda$ by
terms of the order:

$$\frac{1}{\sqrt{N}}$$
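As a minimal sketch of the statistic in Eq. 3.2-7, the function below computes the chi-square value and its degrees of freedom from a table of transition counts $N_{ij}$. The 2 x 2 table is a made-up example, not data from document C1, and the function name is illustrative.

```python
def chi_square_independence(counts):
    """Chi-square statistic for independence in a contingency table.

    counts[i][j] is the number of times run i was followed by run j.
    Returns (chi2, degrees_of_freedom).
    """
    n_rows = len(counts)
    n_cols = len(counts[0])
    row_totals = [sum(row) for row in counts]                      # N_i.
    col_totals = [sum(counts[i][j] for i in range(n_rows))         # N_.j
                  for j in range(n_cols)]
    n = sum(row_totals)                                            # N
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows or columns
                chi2 += (counts[i][j] - expected) ** 2 / expected
    dof = (n_rows - 1) * (n_cols - 1)
    return chi2, dof

# Hypothetical 2 x 2 table of run-pair counts.
chi2, dof = chi_square_independence([[10, 20], [30, 40]])
print(f"chi-square = {chi2:.4f} with {dof} degree(s) of freedom")
```

For a square k x k table the degrees of freedom reduce to $(k - 1)^2$, as stated above.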
Since the model can be broken into two stochastic submatrices, each
submatrix was tested individually. This was also done to isolate the
possibility that the matrix P[B|W] might be of zero order (a possible
conclusion from Fig. 3.2-2). Document C1 was tested, using a large
sample (over the area indicated in Appendix A.2) so that:

N[B|W] = 19,342
N[W|B] = 19,059

causing terms differing from $-2 \log \lambda$ of the order:

$$\frac{1}{\sqrt{N[B|W]}} = 0.0072$$

$$\frac{1}{\sqrt{N[W|B]}} = 0.0073$$

The computed values for the statistic were:

$$\chi^2[B|W] = 2589$$
$$\chi^2[W|B] = 4361$$

Although the value for matrix P[B|W] was noticeably lower, they both
exceeded the α levels of 0.05 and 0.025 with ease. With K = 20 there
were $(K - 1)^2 = 361$ degrees of freedom, such that [Hald - 1952]:

$$\chi^2_{0.050}(361) \approx 406$$

$$\chi^2_{0.025}(361) \approx 415$$

were the levels for the test.
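The test levels quoted above come from tables. As a rough cross-check only, the sketch below uses the Wilson-Hilferty approximation to the upper chi-square quantile, which is a substitute technique and not the author's method. It assumes $(K - 1)^2 = 361$ degrees of freedom for K = 20 and the standard normal deviates 1.6449 and 1.9600 for the 0.05 and 0.025 levels.

```python
import math

def chi2_quantile_wilson_hilferty(dof, z):
    """Approximate upper chi-square quantile for `dof` degrees of freedom.

    z is the standard normal deviate for the desired tail area
    (1.6449 for 0.05, 1.9600 for 0.025).
    """
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * math.sqrt(a)) ** 3

level_05 = chi2_quantile_wilson_hilferty(361, 1.6449)
level_025 = chi2_quantile_wilson_hilferty(361, 1.9600)
row_level_05 = chi2_quantile_wilson_hilferty(19, 1.6449)

print(f"chi2_0.050(361) ~ {level_05:.1f}")
print(f"chi2_0.025(361) ~ {level_025:.1f}")
print(f"chi2_0.050(19)  ~ {row_level_05:.1f}")
```

Both computed statistics (2589 and 4361) lie far above these levels, so the conclusion does not depend on the accuracy of the approximation.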

The rows of the two matrices were also tested against the null
hypothesis that they came from the same population as their marginal
distribution. In other words, if runs are truly independent, grouping
them on the basis of the preceding run should not make them depart from
a zero-order distribution. The latter has to be estimated too, of course;
but coming from the larger sample, the estimated marginal distribution is
considered adequate. The results of this test are summarized in Table
3.2-1.


Table 3.2-1

χ² DATA TO TEST ROW DISTRIBUTIONS

  i    N_i.[B|W]   χ²_i.[B|W]   N_i.[W|B]   χ²_i.[W|B]

  1       678         443          3301         233
  2      1077         345          8524         317
  3      2045         242          2896         721
  4      2924         179          1241         446
  5      3055         209           807         323
  6      1757         321           613         264
  7      1050         188           592         437
  8       798          60           441         282
  9       649          43           362         157
 10       411          26           150         172
 11       303          43             9          33
 12       229          40             0           0
 13       140          45             0           0
 14        58          42             0           0
 15       111          35             0           0
 16       308          47             0           0
 17       431          53             0           0
 18       352          88             0           0


Except for the empty rows in matrix [W|B] (beyond W = 10),
all rows had a total frequency of at least:

$$N_{i\cdot} \ge 58$$

such that:

$$\frac{1}{\sqrt{N_{i\cdot}}} \le 0.131$$

The α-level for (K-1) = 19 degrees of freedom:

$$\chi^2_{0.05}(19) = 30.1$$

was only missed by the 10th row of matrix P[B|W]. It had value:

$$\chi^2_{i\cdot}[B \mid W = w_{10}] = 26$$

which would exceed the 0.2 α-level for 19 d.f. Perhaps its distribution
too closely resembled the estimated marginal. Considering that only one
out of the twenty-eight rows fell below the test level, this could also
be attributed to Type II error.
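The row-by-row test can be sketched as follows, assuming each row of counts is compared with the marginal distribution estimated from the column totals of the whole table. The counts are hypothetical and the function name is illustrative.

```python
def row_chi_squares(counts):
    """Chi-square of each row against the estimated marginal distribution.

    counts[i][j] is the number of times run j followed run i. Each row
    statistic has (number of columns - 1) degrees of freedom.
    """
    n_cols = len(counts[0])
    col_totals = [sum(row[j] for row in counts) for j in range(n_cols)]
    n = sum(col_totals)
    marginal = [c / n for c in col_totals]        # estimated zero-order distribution
    results = []
    for row in counts:
        n_i = sum(row)
        if n_i == 0:                              # empty row: nothing to test
            results.append(0.0)
            continue
        chi2 = sum((row[j] - n_i * p) ** 2 / (n_i * p)
                   for j, p in enumerate(marginal) if p > 0)
        results.append(chi2)
    return results

# Hypothetical 2 x 2 table of run-pair counts.
rows = row_chi_squares([[10, 20], [30, 40]])
print([round(x, 4) for x in rows])
```

With the marginal estimated from the same table, the row statistics add up to the overall statistic of Eq. 3.2-7, so the row test shows which prior runs contribute most to the departure from independence.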

Higher orders of dependency were not explored due to prac- 
tical limits on computation. The evidence of first order dependence was 
sufficient to encourage further study on matching alphabets. Source 
alphabets based on this dependence produced noticeable increases in com- 
pression, as will be shown next. 


3.2.3. Alphabets for the dependence between runs 

Just as comparisons were made earlier between sources
with alphabets [X], [X,Y], and [Y|X], comparison may now be made between
alphabets [W+B], [W+B,W+B], and [W+B|W+B]. In this case, source
alphabets based on the marginal, joint, or conditional entropies of runs
rather than elements are being considered.

Figures 3.2-4 and 5 illustrate joint distributions for 
stochastic submatrices of the matrix: 


P[W+B, W+B] (3.2-8)

where analogous to the definitions used with Eq. 3.2-1:

$$P[B,W] = \{P[b_i, w_j] \ge 0\}$$

$$P[W,B] = \{P[w_i, b_j] \ge 0\}$$

$$P[W,W] = \{P[w_i, w_j] = 0\} = 0$$

$$P[B,B] = \{P[b_i, b_j] = 0\} = 0$$

The left and right hand columns in both figures juxtapose estimated
distributions with distributions derived using Capon's model (for the same
raw data). The three rows in each figure are derived from the raw data
of test documents C1, C2, and C2 (5 mils) respectively. Figure 3.2-4 enables
comparison between their P[W,B] distributions, while Fig. 3.2-5 displays
all the P[B,W] distributions. The approximate nature of a model based
on 1st-order dependence between elements rather than runs is apparent
from the fit of Capon's model to the measured distribution.

In studying these figures, one should keep in mind that 
the top two rows are for documents scanned with a normalized resolution 
frequency of 1.25 X 1.25. The bottom row document has been scanned at 
2.00 X 2.00. Comparing the bottom two rows which represent the same 
Dual-Gothic document scanned at different resolutions, one can see the 
same topography-- just scaled differently. 

The top two rows allow comparison at the same resolution 
between distributions for Pica and Dual-Gothic type. The resemblance 
between distributions indicates the existence of fundamental properties 
independent of the particular font being used. This is encouraging. A 
common distribution facilitates the synthesis of a mutually compatible 
compression code. One major purpose of this study was to explore whether
the spatial frequency characteristics of different type fonts would not
indeed be similar. This is somewhat analogous to mask-matching optical
character recognition using a Vander Lugt filter in the spatial frequency
domain [Goodman - 1968]. However, rather than depending on consistent
frequency distributions for individual characters, one need only worry about
consistency for the aggregate distribution of all characters. If the
latter is consistent over a number of fonts, it points to the existence
of a code matched for most printed matter (such as all standard size
typewriting and printed text).

In section 3.1, an inequality derived from the grouping 
axiom [Ash - 1965] was illustrated when joint and conditional element 
alphabets were compared: 

$$H[Y \mid X] \le H[X, Y] \tag{3.2-9}$$

This is applicable here for run alphabets:

$$H[W+B \mid W+B] \le H[W+B, W+B]$$
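A minimal sketch of this inequality for run alphabets follows. It assumes the joint distribution is estimated from a table of run-pair counts and uses the grouping relation H[X,Y] = H[X] + H[Y|X]. The counts are invented for illustration.

```python
import math

def entropy(probs):
    """Entropy in bits of a sequence of probabilities (zeros are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def joint_and_conditional_entropy(counts):
    """Return (H[X,Y], H[Y|X]) estimated from counts[i][j] of pairs (x_i, y_j)."""
    n = sum(sum(row) for row in counts)
    h_joint = entropy(c / n for row in counts for c in row)
    h_prior = entropy(sum(row) / n for row in counts)   # H[X]
    return h_joint, h_joint - h_prior                   # grouping: H[Y|X] = H[X,Y] - H[X]

# Hypothetical counts: each run strongly predicts the run that follows it.
h_joint, h_cond = joint_and_conditional_entropy([[40, 10], [10, 40]])
print(f"H[X,Y] = {h_joint:.4f} bits, H[Y|X] = {h_cond:.4f} bits")
```

The gap between the two values is H[X], the entropy already spent on the preceding run.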

In order to evaluate conditional source alphabets, the 
estimates of conditional distributions have been extended to both sizes 
of Dual-Gothic type and plotted for comparison in Fig. 3.2-6. (The 
similarity between distributions even appears to have increased, an 
illusion probably due to smaller vertical scale.) A separate code 
dependent on the prior state, must be designed for each of the condi- 
tional submatrices P[W|B] and P[B|W] . To capitalize on this resem- 
blance, a common distribution can be defined as a linear combination of 
individual distributions: 

$$P_0[X] = \sum_k \alpha_k P_k[X], \quad k = 1, 2, \ldots, r \tag{3.2-10}$$

where:

$P_k[X]$, indicates the k-th distribution over X
$\alpha_k$, are coefficients that sum to unity
X, is the source alphabet (here, a conditional row like: $[W \mid B = b_s]$)

The $\alpha_k$ are just weights, reflecting the a-priori incidence of the
different distributions.

Fig. 3.2-6. MARKOV (CONDITIONAL) PROBABILITY DISTRIBUTIONS.

Since H[X] is a convex function [Ash - 1965]:

$$H_0[X] \ge \sum_k \alpha_k H_k[X] \quad \text{bits/symbol} \tag{3.2-11}$$

where $H_0[X]$ is the entropy for $P_0[X]$. Physically, this means that an
encoder based on $P_0[X]$ is inferior to a system that recognizes each
incoming distribution and matches it with its own encoder. In the adaptive
system individual encoders approach $H_k[X]$ bits per symbol, and their
average rate would approach the right hand side of Eq. 3.2-11.
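Eq. 3.2-11 can be checked numerically with a small sketch. The two three-symbol distributions and the equal weights below are hypothetical stand-ins for the conditional rows of two type fonts.

```python
import math

def entropy(probs):
    """Entropy in bits of a list of probabilities (zeros are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mixture(dists, weights):
    """Common distribution P_0[X]: weighted sum of the individual P_k[X]."""
    return [sum(a * d[j] for a, d in zip(weights, dists))
            for j in range(len(dists[0]))]

# Hypothetical distributions for two fonts, with equal a-priori weights.
P1 = [0.6, 0.3, 0.1]
P2 = [0.1, 0.3, 0.6]
ALPHA = [0.5, 0.5]

P0 = mixture([P1, P2], ALPHA)
h0 = entropy(P0)                                            # single common encoder
avg = sum(a * entropy(p) for a, p in zip(ALPHA, [P1, P2]))  # adaptive system

print(f"H_0[X] = {h0:.4f} bits/symbol, weighted average = {avg:.4f} bits/symbol")
```

The difference between the two numbers is the price paid for using one encoder instead of recognizing each incoming distribution.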

For a single encoder, only one distribution can be guaranteed an
optimal code. To match a multiplicity of distributions, $P_0[X]$
should be best. To argue this, define:

$$Q_{0k}[X] = -\sum_j P_0[x_j] \log_2 P_k[x_j] \quad \text{bits/symbol} \tag{3.2-12a}$$

where for $P_k[X] = P_0[X]$, by definition:

$$Q_{00}[X] = H_0[X] \tag{3.2-12b}$$

Theorem 3-1: $Q_{0k}[X]$ is a convex function.

In equation form:

$$Q_{00}[X] \le \sum_k \alpha_k Q_{0k}[X] \tag{3.2-13}$$

with equality iff $P_k[X] = P_0[X]$.

Proof:

From Lemma 14.1 of [Ash - 1965]:

$$H_0[X] \le Q_{0k}[X] \tag{3.2-14}$$

with equality iff $P_k[X] = P_0[X]$.

Case A. All $P_k[X] \ne P_0[X]$:

$$H_0[X] < Q_{0k}[X] \tag{3.2-14a}$$

Taking the weighted positive sum of both sides of Eq. 3.2-14a:

$$\sum_k \alpha_k H_0[X] < \sum_k \alpha_k Q_{0k}[X]$$

and since $H_0[X]$ is independent of k:

$$H_0[X] < \sum_k \alpha_k Q_{0k}[X] \tag{3.2-13a}$$

Case B. All $P_k[X] = P_0[X]$:

$$H_0[X] = Q_{0k}[X] \tag{3.2-14b}$$

Taking the weighted positive sum of both sides of Eq. 3.2-14b:

$$\sum_k \alpha_k H_0[X] = \sum_k \alpha_k Q_{0k}[X]$$

and again using $H_0[X]$ independent of k:

$$H_0[X] = \sum_k \alpha_k Q_{0k}[X] \tag{3.2-13b}$$

Combining Eq. 3.2-13a and b yields:

$$H_0[X] \le \sum_k \alpha_k Q_{0k}[X] \tag{3.2-15}$$

with equality iff all $P_k[X] = P_0[X]$.

Apply Eq. 3.2-12b to Eq. 3.2-15, and the theorem is complete:

$$Q_{00}[X] \le \sum_k \alpha_k Q_{0k}[X]$$

with equality iff all $P_k[X] = P_0[X]$. Q.E.D.

The single-encoder-multiple-distribution problem should
have a code based on $P_0[X]$. After all, this is the resultant
distribution the encoder will see. If another code is used, for example
matched to a specific input distribution $P_k[X] \ne P_0[X]$, $Q_{0k}[X]$
for the overall distribution $P_0[X]$ will not be minimum. Thus, writing
Theorem 3-1 in terms of expectations:

$$E_0\{-\log_2(P_0[x_j])\} \le E_0\{-\log_2(P_k[x_j])\} \tag{3.2-16}$$

(where $E_0\{\ \}$ represents an expected value over distribution $P_0[X]$).
For optimal coding [Ash - 1965] code word lengths, $\Lambda_k[x_j]$, satisfy
the inequality:

$$-\log_2(P_k[x_j]) \le \Lambda_k[x_j] < -\log_2(P_k[x_j]) + 1$$

From this the expectation of code word lengths must approximately
follow:

$$E_0\{\Lambda_0[x_j]\} \le E_0\{\Lambda_k[x_j]\} \tag{3.2-17}$$

(This certainly holds when the inequality in Eq. 3.2-16 exceeds unity.)
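The argument for a code matched to $P_0[X]$ can be sketched numerically. The example below assumes two hypothetical input distributions with equal weights, and assigns each code the integer word lengths $\lceil -\log_2 P_k[x_j] \rceil$, which satisfy the inequality quoted above. It is an illustration, not the coding procedure used in the study.

```python
import math

def cross_entropy(p0, pk):
    """Q_0k[X]: expected value over P_0 of -log2 P_k, in bits/symbol."""
    return -sum(a * math.log2(b) for a, b in zip(p0, pk) if a > 0)

def code_lengths(p):
    """Integer code word lengths ceil(-log2 p) for a code matched to p."""
    return [math.ceil(-math.log2(q)) for q in p]

def expected_length(p0, lengths):
    """Average code word length when symbols actually follow P_0."""
    return sum(a * n for a, n in zip(p0, lengths))

# Hypothetical input distributions and their equally weighted combination.
P1 = [0.6, 0.3, 0.1]
P2 = [0.1, 0.3, 0.6]
P0 = [0.35, 0.30, 0.35]

q00 = cross_entropy(P0, P0)                  # equals H_0[X]
q01 = cross_entropy(P0, P1)
q02 = cross_entropy(P0, P2)
len0 = expected_length(P0, code_lengths(P0))
len1 = expected_length(P0, code_lengths(P1))
len2 = expected_length(P0, code_lengths(P2))

print(f"Q_00 = {q00:.4f}, Q_01 = {q01:.4f}, Q_02 = {q02:.4f} bits/symbol")
print(f"average lengths: {len0:.2f}, {len1:.2f}, {len2:.2f} bits/symbol")
```

In this example the code matched to the common distribution gives the shortest average word length, in line with Eq. 3.2-17.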

Conditional source entropies for documents C1 and C2
have been tabulated in Table 3.2-3 for comparison with the previous
source alphabets in Table 3.1-1. Computations have been made both for
K = 20, and K = 40. (Beyond K, since run dependency was lumped
together, all runs were treated as equiprobable. This only causes
entropy estimates to err on the high side. As K increased, appropriate
additional decrease in entropy was observed.) Further exploration
into one-dimensional coding was discontinued, when the following results
for two-dimensional coding became evident.

Table 3.2-3

COMPRESSION FOR SOURCE ALPHABETS
(ONE-DIMENSIONAL RUNS)

Source Alphabet            Document C1     Document C2     Document C2
                           (Pica)          (Dual-Gothic)   (Dual-Gothic)

[W+B|W+B], K = 20          4.260           4.901           6.578    elements/bit
[W+B|W+B], K = 40          4.419           5.218           7.153    elements/bit

Blackness                  9.5%            8.0%            7.2%
Resolution (absolute)      8 x 8 mil       8 x 8 mil       5 x 5 mil
Resolution (normalized)    1.62 x 1.62     1.25 x 1.25     2.00 x 2.00
3.3. Source Alphabets in Two Dimensions

The simple source alphabets used for line-by-line scanning are
readily extended to two dimensions. Alphabets with higher than 1st-order
Markov dependency are of interest, since a two-dimensional pattern has
many neighbors. Table 3.3-1 shows results for source alphabet patterns
("X"s) some of which are dependent on nearby neighbors ("O"s).

Table 3.3-1

COMPRESSION FOR SOURCE ALPHABETS
(TWO-DIMENSIONAL)

Row   Source Alphabet              Document C2      Document C2
                                   (Dual-Gothic)    (Dual-Gothic)

1     X  (single element)           2.50             2.69

2     (extra pattern; see text)     2.32             4.25

3     O O O
      O X                           5.80            10.80

4       O O
      O X X
      O X X                         6.57            11.15

5       O
      O X O
        O                          10.77            23.80

      Blackness                     8.0%             7.2%
      Resolution (absolute)         8 x 8 mil        5 x 5 mil
      Resolution (normalized)       1.25 x 1.25      2.00 x 2.00


The pattern in the fifth row of Table 3.3-1 achieves the best
results. However, dependency has been utilized from all four sides. For
practical scanning schemes, this is equivalent to non-causal prediction
(using elements received at some future time). As a result, the other
patterns are restricted to dependency on elements that would already be
received in line-by-line scanning.

In the fourth row, a source alphabet with four elements is shown 
with dependency based on four preceding elements. This pattern could be 
applied with conditional Huffman coding [Huffman - 1952], one code for 
each of the 16 prior states. 

In the third row, 4th-order Markov dependence is used in a causal
scheme. Patterns of this type have been explored for two-level data by 
[Wholey - 1961]. Notice that conditional Huffman coding for a one-bit 
alphabet is impractical. However, predictive coding [Elias - 1955] has 
been shown to achieve average message lengths approaching the conditional 
entropy for such a source. 
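A minimal sketch of how such a conditional entropy might be estimated is given below. It assumes a 0/1 image stored as a list of rows, a causal neighborhood of three elements on the line above plus the element to the left (an assumed reading of the third-row pattern), and white elements outside the page. The function name and test images are illustrative.

```python
import math
from collections import Counter

def conditional_entropy_causal(image):
    """Estimate H[X | four previously scanned neighbors] in bits/element."""
    rows, cols = len(image), len(image[0])

    def px(r, c):
        # Elements outside the page are treated as white (0).
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    state_counts = Counter()
    joint_counts = Counter()
    for r in range(rows):
        for c in range(cols):
            state = (px(r - 1, c - 1), px(r - 1, c), px(r - 1, c + 1), px(r, c - 1))
            state_counts[state] += 1
            joint_counts[(state, image[r][c])] += 1

    n = rows * cols
    h = 0.0
    for (state, _), count in joint_counts.items():
        # P[state, x] * log2 P[x | state]
        h -= (count / n) * math.log2(count / state_counts[state])
    return h

# Two synthetic pages: all white, and vertical bars two elements wide.
white_page = [[0] * 16 for _ in range(16)]
bar_page = [[0, 1, 1, 0] * 4 for _ in range(16)]

h_white = conditional_entropy_causal(white_page)
h_bars = conditional_entropy_causal(bar_page)
print(f"white page: {h_white:.4f} bits/element, bars: {h_bars:.4f} bits/element")
```

Compression in elements per bit is the reciprocal of this entropy. On the bar page half the elements are black, so the single-element entropy is 1 bit, yet the neighbors on the line above predict almost every element.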

The second row is just an extra pattern, for which data was 
readily available in measuring for the fourth row. Finally, the first 
row is just the basic single element entropy, provided for future 
comparison. 

Comparing columns illustrates how an increase in resolution 
increases compression for the same pattern. The rates of increase tend 
to improve for the better patterns too. This resolution dependency is 
explored next. 

3.4. The Influence of Resolution on Source Alphabets 

The use of compression to judge encoding schemes suffers from
a common problem. The results of a particular measurement are hard to
generalize. Moreover, the same coding scheme applied to the same image
will indicate greater compression as the resolution is increased. (This
property can give rise to glowing reports and hinders objective evaluation.)
This section attempts not only to display variations in compression, but
to model them for a large body of present-day documents: namely, the
images of printed matter, found so predominantly in the commercial
image-processing market.

3.4.1. Entropy and compression variations with spatial frequency

The simple scan patterns in two dimensions presented in section 3.3
were next checked for their variation with frequency. In doing
so, another sample of Dual-Gothic type was used. Rather than the random 
sample of print in document C2, documents B1-B5 from the legibility study 
were used. This was done to display illegibility (straight dashed line) 
and entropy (solid curves) from the same sample, in order to observe the 
tradeoff for various patterns (Fig. 3.4-1). 



Fig. 3.4-1. ENTROPY VS RESOLUTION FOR TWO-DIMENSIONAL SCAN PATTERNS. 

The set of scan patterns are like successive attempts to 
grasp the true information in the page. They can only approach what a 
person or pattern recognition system can do. The latter is represented 
by a flat curve at the base of the figure, and is based on an assumption
of 4.03 bits per alphanumeric character [Abramson - 1963]. A few more
bits per page were added to this pattern recognition model to account for 
character addressing, character size, and the type font. However, their 
effect turned out to be insignificant. An entropy curve based on pattern 
recognition is also displayed taking legibility into consideration. The 
two pattern recognition curves tend to overlap at normalized frequencies 
greater than one element per stroke. 

At the other end of the scale, the total elements in the 
sample, assuming maximum uncertainty, were plotted as the dashed curve 
marked with P[B] = 0.093. This represents an upper bound for the entropy 
in the sample at one bit per element: 

$$N H_{max}[X] = N \quad \text{bits/page} \tag{3.4-1}$$

It is interesting how the ineffectual scan patterns tend to 
follow the upper bound based on totally random page elements. Conversely, 
the effective scan patterns are not only better, but tend to be flat with 
increasing resolution. Apparently they are "getting at," or "recognizing" 
some of the information in the page; and after a point, increased
resolution has little to contribute (as should be the case for pattern
recognition).

For these effective patterns it seems that some form of 
"recognition" is taking place; perhaps recognition of some fundamental 
feature in characters, like their strokes. This is what the run-length 
patterns in one-dimension tended to do. They captured the intervals that 
make up the structure of a character, and seemed less sensitive to the 
particular font in use. 

This stroke recognition by good two-dimensional patterns, 
is borne out by the correlation of results in Fig. 3.4-1 with measurements 
taken from document C2. The characters in document C2 are distributed 
differently but this seems to have little effect on the performance of the 
good patterns. They apparently get their compression from features of 
characters rather than the total character (as will be seen in Fig. 3.4-2).

To reconcile the data from documents B1-B5 and C2, it was
necessary to make allowance for the percentage of black vs white elements
in the sample (as will be seen in section 3.4.2, this percentage is of
fundamental significance). It was discovered that the patterns predict
all-white so well that addition of white elements has little effect on
the page entropy. This can be explained using the definition of
conditional entropy (Eq. 3.1-6 generalized):

$$H[Y \mid X] = \sum_i P[x_i]\, H[Y \mid x_i] \tag{3.4-2}$$

For the term, $H[Y \mid x_w]$, in which prior state, $x_w$, is all-white, the
entropy is very low. That is, predicting the following state, Y, to
be all-white has a high probability of success, and:

$$H[Y \mid x_w] \approx 0 \tag{3.4-3}$$

As a result, additions to the number of white page elements only affect
the probability weights in Eq. 3.4-2, which have been estimated using:

$$P[x_i] = \frac{N_i}{N}$$

For the addition of some more white elements, $N_w$, the new total becomes

$$N' = N + N_w$$

and for only the all-white state, $x_w$, does the numerator also change:

$$P[x_w] = \frac{N_i + N_w}{N + N_w}$$

This change in $P[x_w]$ has little effect in Eq. 3.4-2, when multiplied
with its corresponding negligible entropy (Eq. 3.4-3). For the rest:

$$P[x_i] = \frac{N_i}{N + N_w}, \quad \text{for } x_i \ne x_w$$

producing little overall effect for small changes in total N:

$$N_w \ll N$$
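The effect of added white elements on page entropy can be illustrated with a short sketch. The state names, counts, and per-state entropies below are invented for the example. The page entropy is taken as N times the conditional entropy of Eq. 3.4-2.

```python
def page_entropy_bits(state_counts, state_entropies):
    """Page entropy N * H[Y|X], with P[x_i] estimated as N_i / N."""
    n = sum(state_counts.values())
    h = sum((count / n) * state_entropies[state]
            for state, count in state_counts.items())
    return n * h

# Hypothetical prior states with their counts N_i and entropies H[Y|x_i].
counts = {"all_white": 9000, "edge": 700, "all_black": 300}
entropies = {"all_white": 0.01, "edge": 0.90, "all_black": 0.30}

before = page_entropy_bits(counts, entropies)

# Add N_w = 1000 white elements; only the all-white count changes.
padded = dict(counts, all_white=counts["all_white"] + 1000)
after = page_entropy_bits(padded, entropies)

print(f"before: {before:.1f} bits, after: {after:.1f} bits")
print(f"relative change: {(after - before) / before:.2%}")
```

Because N times P[x_i] is just N_i, the added elements contribute only their own count times the nearly zero entropy of the all-white state.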
As a result, the data in Fig. 3.4-1 can be adjusted to
8.0 percent black by simply adding more white elements, resulting in a
new dashed curve for total page elements:

$$N' H_{max} = N' \quad \text{bits/page} \tag{3.4-4}$$

Compression, defined previously for individual symbols
from a source alphabet (Eq. 3.1-7), can now be defined for a page:

$$C = \frac{N}{N H} \quad \text{symbols/bit} \tag{3.4-5}$$

In Fig. 3.4-1, this is simply the ratio between any curve for a scan
pattern and the dashed line representing the number of page elements (for a
given percentage of black, P[B]).

When percent black is accounted for, the compression
measurements for documents B1-B5 and C2 tend to correspond. A measure of
performance independent of the black-white percentages can be defined by
dividing pattern compression by single element compression:

$$C_R = \frac{C}{C[X]} = \text{relative compression} \tag{3.4-6}$$

where the single element compression is:

$$C[X] = \frac{N}{N H[X]} = \frac{N}{-N\left(P[B] \log_2 P[B] + P[W] \log_2 P[W]\right)} \tag{3.4-7}$$

For example, the relative compression for document C2 with 8.0 percent
black, using the scan pattern in the fifth row of Table 3.3-1, is:

$$C_R = \frac{10.77}{2.50} = 4.31$$
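The arithmetic of this example can be reproduced with a short sketch. It assumes the single element compression is the reciprocal of the binary entropy at the stated blackness. The function names are illustrative.

```python
import math

def binary_entropy(p_black):
    """Single element entropy H[X] in bits for a given fraction of black."""
    p_white = 1.0 - p_black
    return -(p_black * math.log2(p_black) + p_white * math.log2(p_white))

def single_element_compression(p_black):
    """C[X] = N / (N H[X]) in elements per bit."""
    return 1.0 / binary_entropy(p_black)

def relative_compression(pattern_compression, element_compression):
    """C_R = C / C[X]."""
    return pattern_compression / element_compression

c_x = single_element_compression(0.08)       # document C2 at 8.0 percent black
c_r = relative_compression(10.77, 2.50)      # tabulated values from Table 3.3-1

print(f"C[X] at 8.0 percent black = {c_x:.2f} elements/bit")
print(f"C_R = {c_r:.2f}")
```

The computed single element compression (about 2.49) agrees closely with the tabulated 2.50 used in the example.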
The values for document C2 in Table 3.3-1 are plotted with
"X"s in Fig. 3.4-2, whereas measurements for documents B1-B5 are plotted
with dots. For comparison, relative compression for the run-length schemes
of Table 3.2-3 have also been included (labeled with the amount of run
dependence, K = 20 or K = 40, that they utilized). The idealized curves
for pattern recognition of page information have also been included.



Fig. 3.4-2. RELATIVE COMPRESSION VS RESOLUTION FOR 
TWO-DIMENSIONAL SCAN PATTERNS. 

3.4.2. A model for the entropy of simple scan patterns 

While reflecting on the curves in Fig. 3.4-2, one should
keep in mind that the entropy measurements are only crude approximations
to the basic information in the underlying stochastic process. An image
of printed matter contains a statistical distribution of black and white
that can be looked upon with different resolutions and with different
patterns, but the fundamental process is the same. In Fig. 3.4-1 the
curve for theoretical pattern recognition stays constant with frequency
(even at a resolution where legibility is low, the original image has
its information; the system simply isn't receiving it).

[Vitushkin - 1961] has rigorously defined the information
capacity of an image dissection scheme in terms of ε-entropy, or
resolution-dependent entropy. For a uniformly dissected page, the maximum
uncertainty for the number of binary elements used, is essentially the
ε-entropy for the binary process. This will be recognized as the dashed
curves in Fig. 3.4-1 which represent the total number of elements for
different sized pages.

[McLachlan - 1958] attacked this question in terms of the
total number of combinations, $2^N$, that N bits can assume. (Notice
that this is just the antilog$_2$ of Vitushkin's definition.) If this
combinatorial approach is applied to a page, known a-priori to have a fixed
percentage of black elements, an upper information bound can be obtained.
Assuming the number of black elements to be M, then the page entropy
for any pattern should obey the inequality:

$$\operatorname{antilog}_2\left(N H_{P[B]}\right) \le \frac{N!}{(N-M)!\,M!} \tag{3.4-8}$$

Using Stirling's approximation, for large N:

$$N! \approx e^{-N} N^{N+\frac{1}{2}} (2\pi)^{\frac{1}{2}} \tag{3.4-9}$$

Eq. 3.4-8 becomes:

$$\operatorname{antilog}_2\left(N H_{P[B]}\right) \le \frac{N^N}{(N-M)^{N-M}\, M^M\, (2\pi)^{\frac{1}{2}}}$$

where the powers of 1/2 have been neglected. Taking the logarithm
(base 2) of both sides and neglecting the term
$\log_2 (2\pi)^{1/2} \approx 1.3$ yields:

$$N H_{P[B]} \le -N\left[\frac{N-M}{N} \log_2 \frac{N-M}{N} + \frac{M}{N} \log_2 \frac{M}{N}\right]$$

which is recognized as the single element entropy for a page:

$$N H_{P[B]} \le N H_{P[B]}[X] \quad \text{for } P[B] = \frac{M}{N} \tag{3.4-10}$$

It is the compression for this single element that has been 
used to define the relative compression used in Fig. 3.4-2. Referring 
back to Fig. 3.4-1, the curve for single-element entropy does bound all
other measurements. Furthermore, it corresponds closely to the shape of 
nearby curves that seem to have little grasp of the information inherent 
in the page. Conversely, as patterns improve, they flatten out with fre- 
quency and tend to imitate the pattern recognition curve. 
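The combinatorial bound behind Eq. 3.4-10 can be checked directly for a small page. The sketch below assumes a hypothetical page of N = 1000 elements with M = 80 black, and compares the exact logarithm of the number of arrangements with the single element page entropy.

```python
import math

def log2_arrangements(n, m):
    """log2 of the number of ways to place m black elements among n."""
    return math.log2(math.comb(n, m))

def single_element_page_entropy(n, m):
    """N * H[X] in bits for P[B] = M / N."""
    p = m / n
    return -n * (p * math.log2(p) + (1 - p) * math.log2(1 - p))

N_ELEMENTS = 1000      # hypothetical page size
M_BLACK = 80           # hypothetical number of black elements (8.0 percent)

exact_bits = log2_arrangements(N_ELEMENTS, M_BLACK)
bound_bits = single_element_page_entropy(N_ELEMENTS, M_BLACK)

print(f"log2 of arrangements      = {exact_bits:.1f} bits")
print(f"single element entropy NH = {bound_bits:.1f} bits")
```

The gap between the two numbers is only a few bits, which is why the terms neglected in the Stirling step matter little for page sizes of practical interest.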

To model the intervening curves, a linear combination of
these two seems appropriate. Thus:

$$N H_{P[B]} = \alpha_1 N H_{P[B]}[X] + \alpha_0 H[PR] \tag{3.4-11}$$

where: $\alpha_0$, $\alpha_1$, are the linear coefficients
and: H[PR], is the theoretical entropy with pattern recognition

A least-squares fit to this model was performed for curves
from Fig. 3.4-1, and uses the linear regression program of section 2.2.
The results are summarized in Table 3.4-1.

A second fit was obtained, assuming that

$$\alpha_0 = 1 - \alpha_1 \tag{3.4-12}$$

Then the regression becomes:

$$\left(N H_{P[B]} - H[PR]\right) = \alpha_1 \left(N H_{P[B]}[X] - H[PR]\right) \tag{3.4-13}$$

and the equation is constrained to have an intercept of zero. The
resulting values are also shown in Table 3.4-1, along with the standard
deviation from the regression line, $S_T$, and coefficient of
determination, $r^2$. Here, T, stands for the total page entropy:

$$T_{P[B]} = N H_{P[B]} \tag{3.4-14}$$
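The constrained fit of Eq. 3.4-13 is a least-squares line through the origin, which can be sketched in a few lines. The data below are synthetic, generated from an assumed coefficient of 0.375 and an assumed H[PR] of 5000 bits. This is not the regression program of section 2.2.

```python
def fit_alpha1(page_entropy, single_element_entropy, h_pr):
    """Zero-intercept least-squares slope for Eq. 3.4-13.

    page_entropy[i]           : measured N*H for the scan pattern (bits)
    single_element_entropy[i] : N*H[X] at the same resolution (bits)
    h_pr                      : theoretical entropy with pattern recognition (bits)
    """
    xs = [x - h_pr for x in single_element_entropy]
    ys = [y - h_pr for y in page_entropy]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Synthetic measurements at five resolutions (all values hypothetical).
H_PR = 5000.0
TRUE_ALPHA1 = 0.375
single = [40000.0, 60000.0, 90000.0, 140000.0, 250000.0]
page = [TRUE_ALPHA1 * x + (1.0 - TRUE_ALPHA1) * H_PR for x in single]

alpha1 = fit_alpha1(page, single, H_PR)
alpha0 = 1.0 - alpha1                      # Eq. 3.4-12
print(f"alpha_1 = {alpha1:.3f}, alpha_0 = {alpha0:.3f}")
```

Subtracting H[PR] from both variables is what forces the intercept to zero, leaving a single coefficient to estimate.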


Table 3.4-1

ENTROPY MODEL COEFFICIENTS

Scan Pattern    N    α_0 H[PR]        α_1              α_0      r²      S_T

(1)             5    4,943 ± 2,040    0.706 ± 0.036    3.007    0.992   2,493
                         0 ± 1,823    0.773 ± 0.033    0.227    0.994   4,077

(2)             5    5,206 ± 1,659    0.605 ± 0.029    3.167    0.993   2,028
                         0 ± 1,731    0.674 ± 0.031    0.326    0.994   3,870

(3)     O O     5    7,627 ± 1,781    0.278 ± 0.031    4.639    0.963   2,177
      O X X              0 ± 2,299    0.375 ± 0.042    0.625    0.964   5,141
      O X X

(4)             5    6,415 ± 1,624    0.124 ± 0.029    3.902    0.860   1,986
                         0 ± 1,838    0.198 ± 0.033    0.802    0.922   4,109


In addition to resolution, print size, and page density,
the model needs to include page area:

$$A = N d_{hv}^2 \quad \text{mils}^2 \tag{3.4-15}$$

rather than N. Inserting Eq. 3.4-12, 14, and 15 into Eq. 3.4-11, the
model becomes:

$$T_{P[B]} = \alpha_1 \frac{A}{d_{hv}^2} H_{P[B]}[X] + (1 - \alpha_1) H[PR] \tag{3.4-16}$$
H[PR] is actually based on the number of characters:

$$H[PR] = 4.03\, N_c \tag{3.4-17}$$

where:

$N_c$ = the number of characters

However, $N_c$ should be a function of character density, P[B], the page
area, A, and the area of a character, $a_c$:

$$N_c = \frac{P[B]\, A}{a_c} \tag{3.4-18}$$

The character area is proportional to the size of its stroke squared:

$$a_c = K_c w_s^2 \tag{3.4-19}$$

where:

$K_c$, is a proportionality constant

This constant has been measured for Dual-Gothic type to be:

$$K_c = 21.5$$

Combining the last four equations produces the final form:

$$T_{P[B]} = \alpha_1 \left(\frac{A}{d_{hv}^2}\right) H_{P[B]}[X] + (1 - \alpha_1) \left(\frac{4.03}{K_c}\right) \frac{P[B]\, A}{w_s^2} \quad \text{bits} \tag{3.4-20}$$

The model is based on resolution, character size, page density and page
area:

$$T_{P[B]} = T(d_{hv}, w_s, P[B], A, \alpha_1, K_c) \quad \text{bits}$$

along with constants $\alpha_1$ and $K_c$ for the scan pattern and type font.
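Eq. 3.4-20 can be written as a small function. The sketch below assumes hypothetical inputs: an 8.5 x 11 inch page expressed in mils, 8-mil scan elements, a 10-mil stroke width, 8.0 percent black, and a coefficient of 0.375. Only the 4.03 bits per character and $K_c = 21.5$ are taken from the text.

```python
import math

def binary_entropy(p_black):
    """Single element entropy H[X] in bits."""
    return -(p_black * math.log2(p_black)
             + (1.0 - p_black) * math.log2(1.0 - p_black))

def page_entropy_model(d_hv, w_s, p_black, area, alpha1, k_c, bits_per_char=4.03):
    """Total page entropy in bits following the form of Eq. 3.4-20.

    d_hv  : scan element size (mils)      w_s : stroke width (mils)
    area  : page area (square mils)       k_c : character-area constant
    """
    n_elements = area / d_hv ** 2                                  # N = A / d^2
    h_pr = bits_per_char * p_black * area / (k_c * w_s ** 2)       # H[PR]
    return alpha1 * n_elements * binary_entropy(p_black) + (1.0 - alpha1) * h_pr

# Hypothetical page and scanner settings.
AREA = 8500.0 * 11000.0        # 8.5 x 11 inch page in square mils
total_bits = page_entropy_model(d_hv=8.0, w_s=10.0, p_black=0.08,
                                area=AREA, alpha1=0.375, k_c=21.5)
print(f"modeled page entropy = {total_bits:,.0f} bits")
```

Setting the coefficient to 1 recovers the single-element entropy of the page, and setting it to 0 recovers the pattern recognition entropy, which are the two limiting curves the model interpolates between.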
For $\alpha_1$ large, the first term predominates and the scan patterns
follow the curve for single-element entropy. For $\alpha_1$ small, the second
term predominates and the entropy tends to stay constant with frequency.

As is apparent from Fig. 3.4-1, the total page entropy increases
with resolution although accompanied by increasing compression. A better
measure than just compression is needed to express the net results for a
given scan pattern, especially when using different resolutions. Let the
values for compression, entropy, and other parameters be measured at the
Nyquist interval for a stroke, $f^*_{hv} = 1$, and these values used as a
standard for comparison. All departure from conditions at $f^*_{hv} = 1$
can then be embodied in terms of resolution efficiency, which is
developed next.

For printed matter, frequency variations are better expressed
in normalized form to account for the scaling effect of character size.
This can be applied to Vitushkin's concept of ε-entropy, using instead:


e * = 


w 


hv 

w 


IT 

hv 


By also restricting the stochastic process to be binomially distributed
with parameter P[B] and using a given scan pattern, the ε*-entropy is
then:

$$T_{P[B]}(\varepsilon^*) = \varepsilon^*\text{-page entropy} \quad \text{bits/page} \tag{3.4-21}$$

For a single-element scan pattern, [X], and P[B] = 0.5,
Eq. 3.4-21 degenerates to the basic ε*-entropy for the binary stochastic
process (Eq. 3.4-1):

$$\left[T_{0.5}(\varepsilon^*)\right]_{[X]} = N(\varepsilon^*) \quad \text{bits/page}$$

This maximum page entropy, when relative to ε*-entropy at the Nyquist
interval for strokes, defines a coefficient of expansion:
The entropy for one symbol from the alphabet (averaged over the
page) becomes:

$$H_{P[B]}(\varepsilon^*) = \varepsilon^*\text{-symbol entropy} \quad \text{bits/symbol} \tag{3.4-22}$$

with corresponding compression:

$$C_{P[B]}(\varepsilon^*) = \frac{1}{H_{P[B]}(\varepsilon^*)} = \varepsilon^*\text{-symbol compression} \quad \text{symbols/bit} \tag{3.4-23}$$

A coefficient of compression is just defined by dividing through by the
compression at ε* = 1 (i.e., $f^*_{hv} = 1$):

$$\gamma_C(\varepsilon^*) = \frac{C_{P[B]}(\varepsilon^*)}{C_{P[B]}(1)} = \text{coeff. of symbol compression}$$

For the parameters related to a specific scan pattern, a measure
embodying all the changes with resolution can now be defined as:

$$\eta(\varepsilon^*) = \frac{\gamma_C(\varepsilon^*)}{\gamma_E(\varepsilon^*)} = \text{resolution efficiency} \tag{3.4-24}$$

Notice the boundary condition at ε* = 1 that:

$$\eta(1) = 1$$
By using the value for ε*-symbol entropy at ε* = 1 and Eq.
3.4-24, the following identities can now be used to define the basic
resolution-dependent parameters:




T 

P[B] 


(£*) 



bits (3.4-25) 


(3. 4-26) 


T P[B] (1) 


bits (3.4-27) 


The efficiency characteristic will fall off sharply for poor patterns.
Conversely, for good patterns it will fall slowly; and for perfect
pattern recognition:

$$\eta(\varepsilon^*) = 1$$

To illustrate this, consider the following example. At a
resolution of $f^*_{hv} = 2.0$, the 4th-order Markov pattern (3rd row, Table
3.3-1) has compression:

C = 10.80

At a resolution of $f^*_{hv} = 1.25$, the non-causal pattern (5th row, Table
3.3-1) has compression:

C = 10.77

If compression values alone were reported, one would perhaps assume the
patterns to be equivalent (or the 4th-order Markov pattern even to be
better!). For a fair comparison of the patterns, the same resolution
should be used. Thus, for $f^*_{hv} = 1.25$, the 4th-order Markov pattern
yields:

C = 5.80

which is almost half the compression for the non-causal pattern.

Furthermore, at $f^*_{hv} = 1$, the ε*-efficiency is unity for all
patterns. But at $f^*_{hv} = 1.25$, for the 4th-order Markov pattern:

η(ε*) = 0.78

while for the non-causal pattern:

η(ε*) = 0.85

Notice how efficiency for the better pattern drops more slowly as
normalized resolution increases from the value $f^*_{hv} = 1$. This is the
measure of a pattern's performance. As Eq. 3.4-26 illustrates, the greater
compression accompanying high resolution is discounted by η(ε*) for
comparison with the performance at ε* = 1.

In Fig. 3.4-1, the two measures of page information — legibility 
and entropy — are compared for a particular font and "scan" pattern. Actu- 
ally, illegibility is plotted to emphasize the tradeoff encountered in a 
system design. If cost functions with resolution are available for both 
parameters, their cost-weighted superposition can now be computed. Mini- 
mizing this total cost function will yield the optimal operating resolution 
for the system. 
Chapter IV


POSSIBLE EXTENSIONS 


Areas for further study may be found in the process of sampling and 
reconstructing digital images, in the measurement of document quality, 
in devising scan patterns with lower entropy, and in the implementation 
of codes for these scan patterns. 

The effect of apertures shaped other than rectangular has not been
evaluated for its impact on printed matter. Black/white decision levels

other than 50 percent of area integration could be explored, along with 
the feedback necessary to match 0 and 100 percent levels to the black 
and white on incoming documents. Quantizing two-level input data with 
multiple levels (grey-scale) may be explored for improving quality. 

Objective quality measures are needed to evaluate grey-scale images. 
The desirability of having visible stair-steps in rectangularly 
reconstructed images also needs to be evaluated. Round print elements may 
be an improvement due to the absence of sharp corners. 

The possibilities for scan patterns have by no means been exhausted. 
Two-dimensional equivalents to "runs" need to be investigated. The lit- 
erature in pattern recognition should be monitored for results that may 
carry over to image compression. Another related area is in optical 
image processing. 

Clever implementations may be found for encoding good source alpha- 
bets, or good coding schemes may specify source alphabets to be tested. 
Adaptive coding schemes may also be investigated for simplicity, good 
ε-compression, and high resolution efficiency. 

In general, searching through fundamental problems related to future 
image processing systems may turn up rewarding areas for further study. 


APPENDIX 


A.1. Experimental System Description 


The block diagram in Fig. A.1-1 summarizes the author's experimental 
system, used to simulate the processing of digital images. 


[Block diagram. Recoverable block labels: TEST DOCUMENTS (INPUT), SCANNER, 
INTERFACE, CPU, DOCUMENT LIBRARY, PRINTER, TEST DOCUMENTS (OUTPUT). Other 
labels: RESOLUTION (at the scanner and at the printer), ERRORS.] 


Fig. A.1-1. EXPERIMENTAL IMAGE-PROCESSING SYSTEM 


Input-output is through a facsimile scanner/printer, modified 
to permit two-dimensional variations in resolution. An IBM 1620 is 
attached through interface electronics, to do the simulation and to permit 
image storage on 1311 disc packs. These digital images can then be trans- 
ferred to tape and preserved in a document library for future study. When 
simulation of compression coding, transmission errors, and decoding is 
complete, images are put out on the printer for evaluation of their 
quality. 

The mechanical scanner/printer can be seen in Fig. A. 1-2. On 
the left side, a test document is mounted on a drum under scanning optics. 
As the drum rotates, the carriage with optics moves slowly to the right 
due to the "pitch" of a lead screw. This combined motion produces line- 
by-line scanning (which is vertical for a document oriented as shown). 

On the right side, an output document can be seen under an elec- 
trostatic print stylus. After charge deposition, output documents are 
placed in a bath of liquid toner and "fixed" on a hot plate. 
Electrostatic printing is used to simulate hardcopy output for a practical 
system. This process is carefully controlled, to maintain consistent 
output quality for experiments measuring the effects of resolution. 

Fig. A.1-2. MECHANICAL SCANNER/PRINTER. 

Although it is possible to transmit real-time from scanner to 
printer, this mode is avoided (except for testing). The synchronization 
errors between the CPU and mechanical I/O must be simulated. Interface 
clock rates are slaved to the drum, using signals from an optical clock 
track for frequent reset of a free-running multivibrator. 

To digitize the image, area integration is first performed, 
followed by sampling and a 50 percent decision level. This results in a 
two-level dissection of the document into rectangular binary "elements" 
(illustrated in Fig. 2.1-1 of the text). 
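
As a rough software analogue of this digitizing step, the sketch below integrates a finely sampled image over rectangular elements and applies a 50 percent decision level. The function name, the nested-list image format, and the treatment of an element that is exactly 50 percent black (taken as white here) are assumptions made for illustration. The experimental system performs the integration mechanically and electronically, not in software.

```python
def digitize(image, elem_h, elem_w, threshold=0.5):
    """Dissect a finely sampled image into binary scan elements.

    image: list of rows; each value is the blackness of a small sub-area,
           from 0.0 (white) to 1.0 (black).
    elem_h, elem_w: number of sub-areas per scan element in each direction.
    Returns rows of 1 (black) or 0 (white), one value per scan element.
    """
    rows = len(image) // elem_h
    cols = len(image[0]) // elem_w
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total = 0.0
            for y in range(r * elem_h, (r + 1) * elem_h):
                for x in range(c * elem_w, (c + 1) * elem_w):
                    total += image[y][x]
            mean = total / (elem_h * elem_w)  # area integration over the element
            out_row.append(1 if mean > threshold else 0)  # 50 percent decision
        out.append(out_row)
    return out


# Hypothetical 2 x 4 patch dissected into two 2 x 2 elements.
patch = [[1.0, 1.0, 0.0, 1.0],
         [1.0, 0.0, 0.0, 0.0]]
elements = digitize(patch, 2, 2)
```

The left element is 75 percent black and is called black; the right element is 25 percent black and is called white.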

Area integration transverse to the direction of scan is 
mechanical. Five interchangeable scanning slits are used, one for each 
transverse resolution. Scanning-slit substitutions are accompanied by 
matching changes in the lead screw drive. Parallel to the scan, 
integration is electronic, with sample and squelch rates changeable to 
obtain five different resolutions as well. Thus, 25 rectangular scan and 
pitch resolutions are available. (Their values are specified in the tables 
of Appendix A.3.) 
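
The 25 combinations can be enumerated directly from the five nominal spatial frequencies that head the tables of Appendix A.3 (75, 100, 125, 150 and 200 elem./inch). The sketch below is illustrative only; the helper name and the rounding to one decimal place are assumptions chosen to match the tabulated spatial distances.

```python
# Nominal spatial frequencies in elem./inch, as tabulated in Appendix A.3.
FREQS = [75, 100, 125, 150, 200]


def mils_per_element(freq):
    """Spatial distance in mils/elem. for a frequency in elem./inch (1 inch = 1000 mils)."""
    return round(1000.0 / freq, 1)


distances = [mils_per_element(f) for f in FREQS]

# Every (horizontal, vertical) pairing of the five distances.
combinations = [(d_h, d_v) for d_v in distances for d_h in distances]
```

Here `distances` comes out as 13.3, 10.0, 8.0, 6.7 and 5.0 mils/elem., and `combinations` holds the 25 rectangular element sizes.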

For the printer, five print heads are used to change transverse 
resolution (along with the same lead screw drive shared with the scanner). 
Resolution parallel to the scan is again determined in the electronics. 
The printing is done using a contact stylus, continuously "on" for 
contiguous black elements. 

The position accuracy achieved in scanning and printing was 
theoretically estimated to be ±0.5 mil. Visual measurement from output 
data was limited to ±0.5 mil due to fuzzy edges from printing electrostat- 
ically. Such measurements, however, tended to confirm that the desired 
resolution values had been achieved (see Fig. A.1-3). Since the same 
hardware is shared for scanning and printing, these measurements apply as 
well to the scanning process. 

These visual measurements were made from microphotographs 
taken of the same region in each test document (Fig. 2.1-1 in the text, 
again, illustrates one such picture). To measure positional accuracy, 
the width of a "step" is used (to cancel out edge effects). To isolate 
a pair of edge effects, the width of a one-bit "stroke" is measured. 
Then the difference between "step" and "stroke" widths indicates the 
amount of two edge effects. 
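
The arithmetic behind this measurement can be sketched as below. It assumes that the "step" and the one-bit "stroke" are nominally the same width (one element), so that their difference is shared equally by the two edges of the stroke. The function name and the sample widths are hypothetical, not measured values from the experiment.

```python
def edge_effect_per_edge(step_width, stroke_width):
    """Average edge effect per black edge, in the units of the inputs.

    Positive values mean the black stroke printed wider than nominal;
    negative values mean width was lost at the edges.
    """
    two_edges = stroke_width - step_width  # difference isolates two edge effects
    return two_edges / 2.0


# Hypothetical widths in mils, read from a microphotograph.
widening = edge_effect_per_edge(8.0, 9.4)   # stroke wider than step
loss = edge_effect_per_edge(8.0, 7.0)       # stroke narrower than step
```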

Second order effects were present due to the electrostatic 
printing process. It had been observed that in printing, black areas 
ended up slightly larger after charge deposition and toning. To compen- 
sate for this in the pitch direction, slightly narrower print heads were 
used. This compensation was insufficient, however, since visual 
measurements indicate an average of 0.7 mil widening in the pitch direction 
at each black edge. 

In the scan direction, a single-shot was used to delete a fixed 
amount of print voltage "on"-time for every black run. This was to 
compensate for the "spread" effect at the two edges of a black run. In 
practice, measurements indicate that an average of 0.5 mil was lost at 
every black edge due to over-compensation for "spread" in the scan 
direction. 

These second order printing variations should have small effect 
on legibility. In all cases the desired dot to be printed was present. 
Only its shape was distorted from square to rectangular. The same ap- 
plies to variations in toning. Cases where a print dot is visible but 
lighter in color should have only a second-order effect on legibility. 

Images stored in the document library were subsequently used 
for the entropy studies in Chapter III (Fig. A.1-4). An IBM 360 Model 
40 was used to estimate the Markov distributions, and compute statistics, 
for these images. Different source alphabets could be readily programmed 
and entropy measurements made. 

[Fig. A.1-4: block diagram. Recoverable labels: DOCUMENT LIBRARY, MARKOV 
DISTRIBUTIONS.] 
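
To indicate the kind of computation involved, the sketch below estimates a Markov distribution from relative frequencies and computes the corresponding conditional entropy. It is a simplified one-dimensional stand-in: the context is the preceding `order` elements of a binary sequence, whereas the source alphabets studied in Chapter III include two-dimensional patterns. The function name and the test sequence are assumptions for illustration.

```python
from collections import Counter
from math import log2


def markov_entropy(bits, order):
    """Estimate entropy in bits/element of a binary sequence, conditioned on
    the preceding `order` elements, using relative frequencies as
    probability estimates."""
    context_counts = Counter()
    joint_counts = Counter()
    for i in range(order, len(bits)):
        context = tuple(bits[i - order:i])
        context_counts[context] += 1
        joint_counts[(context, bits[i])] += 1
    total = sum(context_counts.values())
    entropy = 0.0
    for (context, _symbol), count in joint_counts.items():
        p_joint = count / total                   # P(context, symbol)
        p_cond = count / context_counts[context]  # P(symbol | context)
        entropy -= p_joint * log2(p_cond)
    return entropy


# Hypothetical scan line: strictly alternating black and white elements.
line = [0, 1] * 50
h0 = markov_entropy(line, 0)  # no memory
h1 = markov_entropy(line, 1)  # first-order Markov
```

For the alternating line, the memoryless estimate is 1 bit per element, while the first-order estimate is zero because each element is fully determined by its predecessor.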

A.2. Test Documents 

The first two sets of documents, A and B, were designed by human 
factors engineers R. L. Erdman and A. S. Neal, associates at IBM. These 
test documents played an integral part in their measurements of document 
legibility, to evaluate throughput from the experimental system [Arps, 
et al - 1966]. The raw data from these legibility measurements are 
summarized in Appendix A.3. 

The test document in Fig. A.2-1 was used not only for basic 
legibility measurements, but was useful in tuning up the experimental 
system. The upper part with four sizes of type constitutes document A1. 

Figures A.2-2 through A.2-6 represent five sizes of Dual-Gothic 
type, aligned vertically as well as horizontally. This font was 
selected for its uniformity and stroke width as well as lack of serifs. 

Documents C1 and C2 were from a random selection (picked by a 
secretary just requested to type a page full of text). This sample 
contains some 1400 letters. In Fig. A.2-7, its letter frequencies are 
compared with larger random samples by [Pratt - 1942] and [Dewey - 1923]. 

The documents were generated on a typewriter capable of changing 
fonts. In this way, the positions of characters in the two documents were 
identical. Document C1 was made using common Pica type (Fig. A.2-8). 
Then, for comparison with documents B1-5, Document C2 was made using 
Dual-Gothic type (Fig. A.2-9). 

KARHXCDVUB 
AU2 VB8NCGE 
UGDCEKX853 
G5N83ABKR0 
5 RXKOUE A2Z 
R2BAZG3UDS 
2DEUS50GNH 
DN3GHRZ5XV 
NX05V2SRBC 
XBZRCDH2E8 

BES28N VD3K 
E3HDKXCNOA 
30VN AB8XZU 
OZCXUEKBSG 
ZS 8BG3AEH5 
SHKE50U3VR 
H VA3RZG0C2 
VCU02S5Z8D 
C8GZDHRSKN 
8K5SNV2HAX 


OGS3Z8EN25 
Z5HOSK3XDR 
SRVZHAOBN2 
H2CSVUZEXD 
VD8HCGS 3BN 
CNKV85HOEX 
8XACKRVZ3B 
KBU8A2CSOE 
AEGKUD8HZ3 
U35AGNKVSO 

GORU5XACHZ 
5Z2GRBU8VS 
RSD52EGKCH 
2HNRD35 A8V 
DVX2NORUKC 
NCBDXZ2GA8 
X8ENBSD5UK 
BK3XEHNRGA 
E AOB3VX25U 
3UZEOCBDRG 


j cn vxaeueb 
nkeuao i osv 
eh i ooj aeru 
i tae j n I sxo 
ab I snecrae 
I vcre i kxos 
cukxiahajr 
kohaa I tonx 
heto I cb jea 
tsb j ckvn i o 

b rvn khuea j 
vxueh+o i I n 
u ao i tbeace 
ooeabvs I k i 
e j s I v u rcha 
snrcuoxkt I 
rexkoeahbc 
x i aheso+vk 
aaotsrjbuh 
o I j b rxn vot 


h I i ootskar 
tea jeb rh I x 
b k i n svxtca 
vhceru ab ko 
ut k i xoovh j 
obh aae j u +n 
evt I osnobe 
subcj reev i 
rovknx i sua 
xeuheaarol 

asot i o I xec 
orebaj cask 
jxsv I nkorh 
naruceh j xt 
eoxok i tnab 
i jaehabeov 
anost I v i j u 
I e j rbcuano 
c i nxvko I ee 
kaeauhec i s 


Fig. A.2-3. DOCUMENT B2: DUAL-GOTHIC TYPE, 10.0 MIL STROKE WIDTH. 


5K8HU3DGOB RN AS VX2CEZ 
RAKVG0N5ZE 2XUHCBD83S 
2UAC5ZXRS3 DBGV8ENK0H 
DGU8RSB2H0 NE5CK3XAZV 
N5GK2HEDVZ X3R8A0BUSC 
XR5 A0V3NCS B02KUZEGH8 
B2RUNC0X8H EZDAGS35VK 
ED2GX8ZBKV 3SNU5HORCA 
3N05BKSE AC 0HXGRVZ28U 
0XNREAH3U8 Z VB52CSDKG 

ZBX2 3UV0GK SCERD8HNA5 
SEBD0GCZ5 A H832NKVXUR 
H3ENZ58SRU VK0DXACBG2 
V03XSRKH2G CAZNBU8E5D 
CZ0BH2 AVD5 8USXEGK3RN 
8SZE VDUCNR KGHB35 A02X 
KHS3CNG8X2 A5 VEORUZDB 
AVH08X5KBD URC3Z2GSNE 
UCVZKBRAEN G280SD5HX3 
G8CS AE2U3X 5DKZHNRVB0 


nukojeblha vexoi+srea 
eohensve+o ukajabrxil 
ietserukbj ohonlvxaac 
asbri xohvn e+jecuaolk 
I rvxaae+ue sbnikoojch 
cxualosbol rveahejnkt 
kaoocjrvjaa xuiltsnehb 
hoejknxusl aoaebrei+v 
tjsnhaaorc oalkvxiabu 
bnretloexk jschuaalvo 

vexibajsah nrktoolcue 
uiaavlnrot exhbejekos 
oaolucaxjb latvsnkher 
eljcokianv aoburah+sx 
senkehaoeu IJvoxitbra 
rkehs+ljio cnueaabvxo 
xhi+rbenae keosolvuaj 
atabxvkels hferjcuoon 
oblvauhicr tasxnkoeje 
jvcuootakx blraehesni 


Fig. A.2-4. DOCUMENT B3: DUAL-GOTHIC TYPE, 8.0 MIL STROKE WIDTH. 

BGCS2VUZE3 
E58HDCGS30 
3RKVN85HOZ 
02ACXKR VZ S 
ZDU8BA2CSH 
SNGKEUD8HV 
HX5A3GMKVC 
VBRU05XAC8 
CE2GZRBU8K 
83D5S 2EGKA 

KONRHD35AU 
AZX2 VNORUG 
USBDCXZ2G5 
CBHXESD853 
8EVA3HNKR0 
K3CU0 VXA2Z 
A08GZCBUDS 
3HU2VA0RBC 
RVG0CUZ2E 8 
2C5N 8GSD3K 
netcxho lei 
hnroas j u tb 
b i ascxeevu 
of vscbeeok 


Fig. A.2-5. DOCUMENT B4: DUAL-GOTHIC TYPE, 6.0 MIL STROKE WIDTH. 


Fig. A.2-6. DOCUMENT B5: DUAL-GOTHIC TYPE, 4.0 MIL STROKE WIDTH. 


KZXG600VUN 

[Plot: PERCENT USAGE versus letters of the ENGLISH ALPHABET. Legend, by 
sample size in letters: 1.4 K, Arps; 4.5 K, Pratt; 438.6 K, Dewey.] 


Fig. A.2-7. LETTER FREQUENCY FOR DOCUMENTS C1 AND C2. 

Fig. A.2-8. TEST DOCUMENT C1: PICA TYPE, 13.0 MIL STROKE WIDTH. 


The following cautions may seem unimportant to some; but it is 
surprising how many shorthand writers sidestep these definite aids 
to efficiency. Date the notebook every day. Much needless searching 
through notebooks has been caused by neglect to do this. 

Fig. A.2-9. TEST DOCUMENT C2: DUAL-GOTHIC TYPE, 10.0 MIL STROKE WIDTH. 







A.3. Human Factors Data on Legibility vs Resolution [Arps, et al - 1966] 


To cancel out systematic machine errors due to the direction of 
scan, all resolution combinations were tested twice, once with a document 
scanned vertically and again with a horizontally scanned document. The 
variations due to scan direction were not significant, so that 
observations were combined for pairs of documents at the same resolution 
combination. (These combinations are specified as "horizontal" and 
"vertical" with respect to the lines of type on a document.) 

Legibility results are given in Tables A. 3-1 and A. 3-2 for 
upper- and lowercase Mid-Century type respectively. These values are 
for 4 sizes of type at all 25 resolution combinations. They were each 
estimated using the sample mean from 80 observations (where 80 subjects 
read 10 characters apiece): 


\[
L_i(f_h, f_v, w_s) = \frac{1}{n} \sum_{j=1}^{n} L_{ij}(f_h, f_v, w_s)
\]

where: 

    L_{ij}(f_h, f_v, w_s) = an observation of legibility (%) 
    L_i(f_h, f_v, w_s)    = the sample mean value of legibility 
                            at one resolution combination (%) 
    f_h = horizontal spatial frequency (elem./inch) 
    f_v = vertical spatial frequency (elem./inch) 
    w_s = character size in terms of stroke width (mils) 
    n   = 80 = number of observations 


The standard deviation of the sample mean was also estimated for 
each of the resolution combinations using the unbiased statistic: 


\[
S_{L_i} = \sqrt{ \frac{1}{n(n-1)} \sum_{j=1}^{n}
          \left[ L_{ij}(f_h, f_v, w_s) - L_i(f_h, f_v, w_s) \right]^2 }
\]

The standard deviations for the measurements of uppercase Mid-Century type 
are given in Table A.3-3. 
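
The two statistics can be sketched in a few lines. The function names and the pattern of observations are hypothetical. The sketch computes both the standard deviation of the sample mean, with n(n - 1) in the denominator as described above, and the standard deviation of the individual observations, with n - 1 in the denominator, so that the two can be compared.

```python
from math import sqrt


def sample_mean(obs):
    return sum(obs) / len(obs)


def _sum_sq_dev(obs):
    m = sample_mean(obs)
    return sum((x - m) ** 2 for x in obs)


def std_of_observations(obs):
    """Unbiased estimate of the spread of individual observations (n - 1)."""
    return sqrt(_sum_sq_dev(obs) / (len(obs) - 1))


def std_of_sample_mean(obs):
    """Estimated standard deviation of the sample mean (n(n - 1))."""
    n = len(obs)
    return sqrt(_sum_sq_dev(obs) / (n * (n - 1)))


# Hypothetical box: 80 subjects, two of whom read 1 of 10 characters
# correctly (10 percent) while the rest read none.
obs = [10.0, 10.0] + [0.0] * 78
mean = sample_mean(obs)                 # 0.25 percent legibility
s_obs = std_of_observations(obs)        # about 1.571
s_mean = std_of_sample_mean(obs)        # about 0.176
```

For this hypothetical pattern the n - 1 statistic is about 1.571, a value that also appears in Table A.3-3. That suggests, without establishing, that the tabulated entries describe the spread of the observations rather than of the mean.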

Note that in reading the entries from any box of a Table, they 
correspond to character sizes in descending order. For example, in the 
upper left box of Table A. 3-1, the four entries of percent legibility are 

92.5 for 15 mil stroke-width (114 mils high) 

38.25 for 10 mil stroke-width (76 mils high) 

5.25 for 7.5 mil stroke-width (59 mils high) 

0.0 for 5.0 mil stroke-width (38 mils high) 

Table A.3-4 gives legibility values for upper- and lowercase 
Dual-Gothic type. These values were measured at only one resolution 
combination, 8.0 x 8.0 mils/elem. (equivalent to the center box in 
Tables A.3-1 and A.3-2). However, additional data is given for the 
various sizes of type corrupted by a model for transmission errors. 

Table A.3-5 gives corresponding standard deviation values for 
the Dual-Gothic legibility data. 


Table A.3-1 

UPPERCASE, MID-CENTURY TYPE (FOUR SIZES) 
LEGIBILITY AT HORIZONTAL VS VERTICAL RESOLUTION COMBINATIONS 
FOR 15.0, 10.0, 7.5, AND 5.0 MIL STROKE-WIDTHS 

Columns: HORIZONTAL RESOLUTION. Spatial distance, d_h, in mils/elem. 
         (spatial frequency, f_h, in elem./inch) 
Rows:    VERTICAL RESOLUTION. Spatial distance, d_v, in mils/elem. 
         (spatial frequency, f_v, in elem./inch) 
Entries: percent legibility, one line per stroke width (mils) 

d_v (f_v)   Stroke      13.3      10.0       8.0       6.7       5.0 
             width      (75)     (100)     (125)     (150)     (200) 

13.3 (75)     15.0      92.5    94.625     96.75    97.625    96.375 
              10.0     38.25     69.25      80.0      90.0      86.0 
               7.5      5.25     20.75     42.75    55.125     54.50 
               5.0       0.0      0.25      0.25       3.0      1.75 

10.0 (100)    15.0    96.125      97.0      99.5    99.625    99.125 
              10.0    73.875     90.25     94.25     94.50    97.625 
               7.5    30.375    72.875    84.625     89.25    89.875 
               5.0       0.0     0.625      10.5      30.5    35.875 

8.0 (125)     15.0    95.625    98.875     99.25    99.125    99.875 
              10.0     79.75    95.125     96.75    96.625    98.875 
               7.5     36.25      85.0      90.5    95.625     94.25 
               5.0       1.0       6.5     31.25      54.5      59.0 

6.7 (150)     15.0     97.75     99.75    99.625     100.0    99.875 
              10.0      84.0    95.125      98.5    99.375    99.125 
               7.5    47.875      89.5      93.0      94.5    95.875 
               5.0       3.5    23.375     50.75    72.125    83.875 

5.0 (200)     15.0     98.25    99.875    99.625    99.875     100.0 
              10.0     85.75    98.375    98.875     98.75     99.75 
               7.5     54.25    88.125    92.375      95.5    97.875 
               5.0     3.375      22.0     44.75      81.0     86.75 

Table A.3-2 

LOWERCASE, MID-CENTURY TYPE (FOUR SIZES) 
LEGIBILITY AT HORIZONTAL VS VERTICAL RESOLUTION COMBINATIONS 
FOR 15.0, 10.0, 7.5, AND 5.0 MIL STROKE-WIDTHS 

Columns: HORIZONTAL RESOLUTION. Spatial distance, d_h, in mils/elem. 
         (spatial frequency, f_h, in elem./inch) 
Rows:    VERTICAL RESOLUTION. Spatial distance, d_v, in mils/elem. 
         (spatial frequency, f_v, in elem./inch) 
Entries: percent legibility, one line per stroke width (mils) 

d_v (f_v)   Stroke      13.3      10.0       8.0       6.7       5.0 
             width      (75)     (100)     (125)     (150)     (200) 

13.3 (75)     15.0      82.5     87.75      95.0     94.25     98.25 
              10.0     14.25    44.325     67.25     70.75    67.875 
               7.5      0.75     6.875    11.625     29.75    19.375 
               5.0       0.0      0.25      0.25      2.25     1.375 

10.0 (100)    15.0    92.125    95.125      98.5     99.25     99.75 
              10.0      55.5      80.5    89.375      92.0     94.75 
               7.5      13.5    44.375    62.625    73.875    77.375 
               5.0     0.125     0.875       4.0       9.0    15.125 

8.0 (125)     15.0     95.25    98.625    99.125    99.625    99.125 
              10.0      55.5    89.125     93.25    97.375    98.625 
               7.5     21.75    55.375     80.25    83.875     94.25 
               5.0     0.125       2.0    15.125     22.25      33.0 

6.7 (150)     15.0    95.875    99.125    99.125    99.625      99.5 
              10.0    68.375    93.375    97.375     98.25     98.75 
               7.5     27.25     71.25    88.625    87.125      96.5 
               5.0      0.25      7.25    26.375    47.375    60.125 

5.0 (200)     15.0      97.0     99.25     99.25    99.625      99.0 
              10.0     64.75    94.125    97.375      98.5      99.5 
               7.5     31.75      72.0    92.375    95.375    99.375 
               5.0       0.5      3.75    21.375      53.5      67.0 

Table A.3-3 

UPPERCASE, MID-CENTURY TYPE (FOUR SIZES) 
STANDARD DEVIATION AT HORIZONTAL VS VERTICAL RESOLUTION COMBINATIONS 
FOR 15.0, 10.0, 7.5, AND 5.0 MIL STROKE-WIDTHS 

Columns: HORIZONTAL RESOLUTION. Spatial distance, d_h, in mils/elem. 
         (spatial frequency, f_h, in elem./inch) 
Rows:    VERTICAL RESOLUTION. Spatial distance, d_v, in mils/elem. 
         (spatial frequency, f_v, in elem./inch) 
Entries: standard deviation, one line per stroke width (mils) 

d_v (f_v)   Stroke      13.3      10.0       8.0       6.7       5.0 
             width      (75)     (100)     (125)     (150)     (200) 

13.3 (75)     15.0     6.264     6.740     5.460     4.837     5.335 
              10.0    21.275    20.975    19.093    12.629    17.760 
               7.5     9.274    19.859    27.141    26.908    25.052 
               5.0       0.0     1.571     1.571     6.825     6.517 

10.0 (100)    15.0     6.038     6.038     2.193     1.912     3.258 
              10.0    14.275     9.274     6.319     6.329     4.837 
               7.5    19.320    18.500    14.048     9.908     9.345 
               5.0       0.0     2.436    16.374    23.216    22.258 

8.0 (125)     15.0     6.721     3.556     3.091     3.258     1.118 
              10.0    11.904     6.363     5.223     5.725     3.179 
               7.5    18.581    12.729     9.264     7.605     7.593 
               5.0     3.019    11.374    20.337    23.106    23.090 

6.7 (150)     15.0     5.025     1.571     1.911       0.0     1.118 
              10.0    10.862     7.462     3.593     2.436     2.843 
               7.5    18.940    12.917     8.916     8.845     6.880 
               5.0     7.647    18.755    25.347    15.236    14.539 

5.0 (200)     15.0     4.975     1.118     1.911     1.118       0.0 
              10.0    12.443     4.039     3.556     3.328     1.571 
               7.5    22.263    11.034    11.933     6.917     4.691 
               5.0     6.925    17.603    22.331    16.809    10.997 

Table A.3-4 

LEGIBILITY, DUAL-GOTHIC TYPE (FIVE SIZES) 
UPPER- AND LOWERCASE, WITH AND WITHOUT ERRORS 
FOR 12.0, 10.0, 8.0, 6.0, AND 4.0 MIL STROKE-WIDTHS 

                  Stroke 
                   width    Upper-Case    Lower-Case 

Without Errors      12.0        99.206        99.575 
                    10.0        99.175        99.331 
                     8.0        95.394        98.069 
                     6.0        75.019        85.800 
                     4.0        28.462        37.712 

With Errors         12.0        98.337        98.625 
                    10.0        98.887        95.944 
                     8.0        88.137        89.031 
                     6.0        55.319        58.812 
                     4.0        13.531        22.269 

HORIZONTAL AND VERTICAL RESOLUTION 
d_h = d_v = 8.0 mils/elem. 
(f_h = f_v = 125 elem./inch) 


Table A.3-5 

STANDARD DEVIATION, DUAL-GOTHIC TYPE (FIVE SIZES) 
UPPER- AND LOWERCASE, WITH AND WITHOUT ERRORS 
FOR 12.0, 10.0, 8.0, 6.0, AND 4.0 MIL STROKE-WIDTHS 

                  Stroke 
                   width    Upper-Case    Lower-Case 

Without Errors      12.0         2.101         1.505 
                    10.0         2.113         2.040 
                     8.0         4.471         3.273 
                     6.0        10.307         9.240 
                     4.0        12.147        17.461 

With Errors         12.0         3.084         2.682 
                    10.0         4.424         5.058 
                     8.0         9.246         9.450 
                     6.0        16.714        16.821 
                     4.0         9.635        13.782 

HORIZONTAL AND VERTICAL RESOLUTION 
d_h = d_v = 8.0 mils/elem. 
(f_h = f_v = 125 elem./inch) 